+
+## 📝 Meissonic Updates and Family Papers
+
+- [MaskGIT: Masked Generative Image Transformer](https://arxiv.org/abs/2202.04200) [CVPR 2022]
+- [Muse: Text-To-Image Generation via Masked Generative Transformers](https://arxiv.org/abs/2301.00704) [ICML 2023]
+- [🌟][Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis](https://arxiv.org/abs/2410.08261) [ICLR 2025]
+- [Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer](https://arxiv.org/abs/2411.10781)
+- [Di[𝙼]O: Distilling Masked Diffusion Models into One-step Generator](https://arxiv.org/abs/2503.15457) [ICCV 2025]
+- [🌟][Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model](https://arxiv.org/abs/2505.23606)
+- [DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer](https://arxiv.org/pdf/2507.04947) [ICCV 2025]
+- [MDNS: Masked Diffusion Neural Sampler via Stochastic Optimal Control](https://arxiv.org/abs/2508.10684)
+- [Lavida-O: Elastic Large Masked Diffusion Models for Unified Multimodal Understanding and Generation](https://arxiv.org/abs/2509.19244)
+- [🌟][Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding](https://arxiv.org/abs/2510.06308)
+- [Token Painter: Training-Free Text-Guided Image Inpainting via Mask Autoregressive Models](https://arxiv.org/abs/2509.23919)
+- [TR2-D2: Tree Search Guided Trajectory-Aware Fine-Tuning for Discrete Diffusion](https://arxiv.org/abs/2509.25171)
+- [OneFlow: Concurrent Mixed-Modal and Interleaved Generation with Edit Flows](https://arxiv.org/abs/2510.03506)
+- [Diffuse Everything: Multimodal Diffusion Models on Arbitrary State Spaces](https://arxiv.org/abs/2506.07903) [ICML 2025]
+- [Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy](https://arxiv.org/abs/2510.09012) [NeurIPS 2025]
+- [🌟][From Masks to Worlds: A Hitchhiker's Guide to World Models](https://arxiv.org/abs/2510.20668)
+- [Soft-Di[M]O: Improving One-Step Discrete Image Generation with Soft Embeddings](https://arxiv.org/abs/2509.22925)
+
+- More papers are coming soon!
+See [MeissonFlow Research](https://huggingface.co/MeissonFlow) (Organization Card) for more about our vision.
+
+
+
+
+## 🚀 Introduction
+
+Meissonic is a non-autoregressive text-to-image synthesis model based on masked image modeling (MIM). It generates high-resolution images and is designed to run on consumer graphics cards.
+
+
+
+**Key Features:**
+- 🖼️ High-resolution image generation (up to 1024x1024)
+- 💻 Designed to run on consumer GPUs
+- 🎨 Versatile applications: text-to-image, image-to-image
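
To make "non-autoregressive masked image modeling" concrete: models in this family (MaskGIT, Muse) start from a fully masked token grid and reveal many tokens in parallel at each step, following a masking schedule. The sketch below computes a cosine schedule in the style of MaskGIT. The function name, the 4096-token grid, and the 64 steps are illustrative assumptions, not values read from Meissonic's scheduler.

```python
import math


def cosine_mask_schedule(num_tokens: int, num_steps: int) -> list[int]:
    """Return how many tokens are still masked after each decoding step."""
    remaining = []
    for step in range(1, num_steps + 1):
        # The masked fraction follows cos(pi/2 * progress): slow at first, fast at the end.
        ratio = math.cos(math.pi / 2 * step / num_steps)
        remaining.append(int(math.floor(num_tokens * ratio)))
    return remaining


# Hypothetical setting: a 64x64 token grid decoded in 64 steps.
schedule = cosine_mask_schedule(num_tokens=4096, num_steps=64)
print(schedule[:3], "...", schedule[-3:])
```

Because every step fills in a whole batch of tokens at once, the number of forward passes equals the number of steps rather than the number of tokens, which is the main efficiency argument for this family over token-by-token autoregressive decoding.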
+
+## 🛠️ Prerequisites
+
+### Step 1: Clone the repository
+```bash
+git clone https://github.com/viiika/Meissonic/
+cd Meissonic
+```
+
+### Step 2: Create virtual environment
+```bash
+conda create --name meissonic python
+conda activate meissonic
+pip install -r requirements.txt
+```
+
+### Step 3: Install diffusers
+```bash
+git clone https://github.com/huggingface/diffusers.git
+cd diffusers
+pip install -e .
+cd ..  # return to the Meissonic repository root
+```
+
+## 💡 Inference Usage
+
+### Gradio Web UI
+
+```bash
+python app.py
+```
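
The web UI in `app.py` draws a fresh seed when "Randomize seed" is checked or the seed slider is left at 0, and it returns the seed it used so a result can be reproduced later. A minimal standard-library sketch of that rule follows. `resolve_seed` is a hypothetical helper name, and `random.randrange` stands in for the `torch.randint` call used in the app.

```python
import random

MAX_SEED = 2**32 - 1  # same upper bound as in app.py


def resolve_seed(seed: int, randomize_seed: bool) -> int:
    """Return the seed to use: a fresh random one if requested or if seed is 0."""
    if randomize_seed or seed == 0:
        # randrange excludes the upper bound, like torch.randint
        return random.randrange(0, MAX_SEED)
    return seed


print(resolve_seed(1234, randomize_seed=False))
```

Unchecking "Randomize seed" and entering a non-zero seed should therefore reproduce an earlier image, given the same prompt and settings.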
+
+### Command-line Interface
+
+#### Text-to-Image Generation
+
+```bash
+python inference.py --prompt "Your creative prompt here"
+```
+
+#### Inpainting and Outpainting
+
+```bash
+python inpaint.py --mode inpaint --input_image path/to/image.jpg
+python inpaint.py --mode outpaint --input_image path/to/image.jpg
+```
+
+### Advanced: FP8 Quantization
+
+FP8 weight-only quantization reduces memory usage (see the benchmarks below).
+
+Requirements:
+- CUDA 12.4
+- PyTorch 2.4.1
+- TorchAO
+
+Note: Windows users can install TorchAO with:
+```shell
+pip install --pre torchao --index-url https://download.pytorch.org/whl/nightly/cpu
+```
+
+Command-line inference:
+```shell
+python inference_fp8.py --quantization fp8
+```
+
+Gradio for FP8 (select the quantization method under Advanced Settings):
+```shell
+python app_fp8.py
+```
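
`app_fp8.py` rebuilds the pipeline only when the selected quantization method changes, since loading and quantizing the model is the slow part. The pattern can be sketched without any GPU dependency as below. `PipelineCache` and the stub loader are hypothetical stand-ins for the app's `initialize_pipeline` and `load_models`.

```python
from typing import Callable, Optional


class PipelineCache:
    """Keep one loaded pipeline and reload only when the quantization key changes."""

    def __init__(self, loader: Callable[[str], object]) -> None:
        self._loader = loader
        self._pipe: Optional[object] = None
        self._key: Optional[str] = None

    def get(self, quantization: str) -> object:
        if self._pipe is None or self._key != quantization:
            self._pipe = self._loader(quantization)  # expensive step
            self._key = quantization
        return self._pipe


# Stub loader that records how often a (re)load happens.
load_calls = []


def fake_loader(quantization: str) -> str:
    load_calls.append(quantization)
    return f"pipeline[{quantization}]"


cache = PipelineCache(fake_loader)
cache.get("none")
cache.get("none")  # cache hit, no reload
cache.get("fp8")   # key changed, reload
print(load_calls)
```

Switching the radio button back and forth in the UI triggers a reload each time, so comparisons are quicker if all runs for one setting are done together.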
+
+#### Performance Benchmarks
+
+| Precision (Steps=64, Resolution=1024x1024) | Batch Size=1 (Avg. Time) | Memory Usage |
+|-------------------------------------------|--------------------------|--------------|
+| FP32 | 13.32s | 12GB |
+| FP16 | 12.35s | 9.5GB |
+| FP8 | 12.93s | 8.7GB |
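
Reading the table as relative changes makes the trade-off clearer: in these measurements lower precision mainly saves memory, while latency changes by only a few percent. The snippet below recomputes the relative savings from the table's numbers; nothing in it is measured anew.

```python
# (avg. time in seconds, memory in GB) copied from the benchmark table
benchmarks = {
    "FP32": (13.32, 12.0),
    "FP16": (12.35, 9.5),
    "FP8": (12.93, 8.7),
}

base_time, base_mem = benchmarks["FP32"]


def relative_saving(value: float, baseline: float) -> float:
    """Percentage reduction of value relative to baseline."""
    return 100.0 * (baseline - value) / baseline


savings = {
    name: (relative_saving(t, base_time), relative_saving(m, base_mem))
    for name, (t, m) in benchmarks.items()
}

for name, (time_pct, mem_pct) in savings.items():
    print(f"{name}: time -{time_pct:.1f}%, memory -{mem_pct:.1f}%")
```

Relative to FP32, FP16 is about 7.3% faster with about 20.8% less memory, and FP8 is about 2.9% faster with 27.5% less memory. FP8 is the choice when memory is the constraint, not speed.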
+
+## 🎨 Showcase
+
+
+
+
"A pillow with a picture of a Husky on it."
+
+
+
+
+
"A white coffee mug, a solid black background"
+
+
+## 🎓 Training
+
+To train Meissonic, follow these steps:
+
+1. Install dependencies:
+   ```bash
+   pip install -r train/requirements.txt
+   ```
+
+2. Download the [Meissonic](https://huggingface.co/MeissonFlow/Meissonic) base model from Hugging Face.
+
+3. Prepare your dataset:
+   - Use the sample dataset: [MeissonFlow/splash](https://huggingface.co/datasets/MeissonFlow/lemon/resolve/main/0000.parquet)
+   - Or prepare your own dataset and dataset class, following the format at line 100 of [dataset_utils.py](./train/dataset_utils.py) and lines 656-680 of [train_meissonic.py](./train/train_meissonic.py)
+   - Modify [train.sh](./train/train.sh) with your dataset path
+
+4. Start training from the repository root:
+   ```bash
+   bash train/train.sh
+   ```
+
+Note: For custom datasets, you'll likely need to implement your own dataset class.
+
+
+## 📚 Citation
+
+If you find this work helpful, please consider citing:
+
+```bibtex
+@article{bai2024meissonic,
+ title={Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis},
+ author={Bai, Jinbin and Ye, Tian and Chow, Wei and Song, Enxin and Chen, Qing-Guo and Li, Xiangtai and Dong, Zhen and Zhu, Lei and Yan, Shuicheng},
+ journal={arXiv preprint arXiv:2410.08261},
+ year={2024}
+}
+```
+
+## 🙏 Acknowledgements
+
+We thank the community and contributors for their invaluable support in developing Meissonic. We thank apolinario@multimodal.art for building the Meissonic [demo](https://huggingface.co/spaces/MeissonFlow/meissonic). We thank @NewGenAI and @飛鷹しずか@自称文系プログラマの勉強 for making YouTube tutorials. We thank @pprp for contributing FP8 and INT4 quantization. We thank @camenduru for the [Jupyter tutorial](https://github.com/camenduru/Meissonic-jupyter). We thank @chenxwh for the Replicate demo and API. We thank Collov Labs for reproducing [Monetico](https://huggingface.co/Collov-Labs/Monetico). We thank [Shitong et al.](https://arxiv.org/abs/2411.10781) for identifying effective design choices for enhancing visual quality.
+
+
+---
+
+
diff --git a/Meissonic/app.py b/Meissonic/app.py
new file mode 100644
index 0000000000000000000000000000000000000000..405c7a647f69284d37a28aa20d43c65874dc446c
--- /dev/null
+++ b/Meissonic/app.py
@@ -0,0 +1,149 @@
+import os
+import sys
+sys.path.append("./")
+
+import torch
+from torchvision import transforms
+from src.transformer import Transformer2DModel
+from src.pipeline import Pipeline
+from src.scheduler import Scheduler
+from transformers import (
+ CLIPTextModelWithProjection,
+ CLIPTokenizer,
+)
+from diffusers import VQModel
+import gradio as gr
+
+
+device = 'cuda' if torch.cuda.is_available() else 'cpu'
+
+model_path = "MeissonFlow/Meissonic"
+model = Transformer2DModel.from_pretrained(model_path, subfolder="transformer")
+vq_model = VQModel.from_pretrained(model_path, subfolder="vqvae")
+# text_encoder = CLIPTextModelWithProjection.from_pretrained(model_path, subfolder="text_encoder")
+text_encoder = CLIPTextModelWithProjection.from_pretrained( #more stable sampling for some cases
+ "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"
+ )
+tokenizer = CLIPTokenizer.from_pretrained(model_path, subfolder="tokenizer")
+scheduler = Scheduler.from_pretrained(model_path, subfolder="scheduler")
+pipe = Pipeline(vq_model, tokenizer=tokenizer, text_encoder=text_encoder, transformer=model, scheduler=scheduler)
+pipe.to(device)
+
+MAX_SEED = 2**32 - 1
+MAX_IMAGE_SIZE = 1024
+
+
+def generate_image(prompt, negative_prompt, seed, randomize_seed, width, height, guidance_scale, num_inference_steps, progress=gr.Progress(track_tqdm=True)):
+    if randomize_seed or seed == 0:
+        seed = torch.randint(0, MAX_SEED, (1,)).item()
+    torch.manual_seed(seed)
+
+    image = pipe(
+        prompt=prompt,
+        negative_prompt=negative_prompt,
+        height=height,
+        width=width,
+        guidance_scale=guidance_scale,
+        num_inference_steps=num_inference_steps
+    ).images[0]
+
+    return image, seed
+
+# Default negative prompt
+default_negative_prompt = "worst quality, low quality, low res, blurry, distortion, watermark, logo, signature, text, jpeg artifacts, signature, sketch, duplicate, ugly, identifying mark"
+css = """
+#col-container {
+ margin: 0 auto;
+ max-width: 640px;
+}
+"""
+
+examples = [
+ "Modern Architecture render with pleasing aesthetics.",
+ "An image of a Pikachu wearing a birthday hat and playing guitar.",
+ "A statue of a lion stands in front of a building.",
+ "A white and blue coffee mug with a picture of a man on it.",
+ "A metal sculpture of a deer with antlers.",
+ "A bronze statue of an owl with its wings spread.",
+ "A white table with a vase of flowers and a cup of coffee on top of it.",
+ "A woman stands on a dock in the fog.",
+ "A lion's head is shown in a grayscale image.",
+ "A sculpture of a Greek woman head with a headband and a head of hair."
+]
+
+with gr.Blocks(css=css) as demo:
+    with gr.Column(elem_id="col-container"):
+        gr.Markdown("# Meissonic Text-to-Image Generator")
+        with gr.Row():
+            prompt = gr.Text(
+                label="Prompt",
+                show_label=False,
+                max_lines=1,
+                placeholder="Enter your prompt",
+                container=False,
+            )
+            run_button = gr.Button("Run", scale=0, variant="primary")
+        result = gr.Image(label="Result", show_label=False)
+        with gr.Accordion("Advanced Settings", open=False):
+            negative_prompt = gr.Text(
+                label="Negative prompt",
+                max_lines=1,
+                placeholder="Enter a negative prompt",
+                value=default_negative_prompt,
+            )
+            seed = gr.Slider(
+                label="Seed",
+                minimum=0,
+                maximum=MAX_SEED,
+                step=1,
+                value=0,
+            )
+            randomize_seed = gr.Checkbox(label="Randomize seed", value=True)
+            with gr.Row():
+                width = gr.Slider(
+                    label="Width",
+                    minimum=256,
+                    maximum=MAX_IMAGE_SIZE,
+                    step=32,
+                    value=1024,
+                )
+                height = gr.Slider(
+                    label="Height",
+                    minimum=256,
+                    maximum=MAX_IMAGE_SIZE,
+                    step=32,
+                    value=1024,
+                )
+            with gr.Row():
+                guidance_scale = gr.Slider(
+                    label="Guidance scale",
+                    minimum=0.0,
+                    maximum=20.0,
+                    step=0.1,
+                    value=9.0,
+                )
+                num_inference_steps = gr.Slider(
+                    label="Number of inference steps",
+                    minimum=1,
+                    maximum=100,
+                    step=1,
+                    value=64,
+                )
+        gr.Examples(examples=examples, inputs=[prompt])
+    gr.on(
+        triggers=[run_button.click, prompt.submit],
+        fn=generate_image,
+        inputs=[
+            prompt,
+            negative_prompt,
+            seed,
+            randomize_seed,
+            width,
+            height,
+            guidance_scale,
+            num_inference_steps,
+        ],
+        outputs=[result, seed],
+    )
+
+demo.launch()
\ No newline at end of file
diff --git a/Meissonic/app_Monetico.py b/Meissonic/app_Monetico.py
new file mode 100644
index 0000000000000000000000000000000000000000..f406a996c094c368f5cacc382be490aba04b9ca8
--- /dev/null
+++ b/Meissonic/app_Monetico.py
@@ -0,0 +1,151 @@
+import os
+import sys
+sys.path.append("./")
+
+import torch
+from torchvision import transforms
+from src.transformer import Transformer2DModel
+from src.pipeline import Pipeline
+from src.scheduler import Scheduler
+from transformers import (
+ CLIPTextModelWithProjection,
+ CLIPTokenizer,
+)
+from diffusers import VQModel
+import gradio as gr
+import spaces
+
+device = 'cuda' if torch.cuda.is_available() else 'cpu'
+dtype = torch.bfloat16
+
+model_path = "Collov-Labs/Monetico"
+
+model = Transformer2DModel.from_pretrained(model_path, subfolder="transformer", torch_dtype=dtype)
+vq_model = VQModel.from_pretrained(model_path, subfolder="vqvae", torch_dtype=dtype)
+text_encoder = CLIPTextModelWithProjection.from_pretrained(model_path, subfolder="text_encoder", torch_dtype=dtype) # better for Monetico
+# text_encoder = CLIPTextModelWithProjection.from_pretrained( #more stable sampling for some cases
+# "laion/CLIP-ViT-H-14-laion2B-s32B-b79K", torch_dtype=dtype
+# )
+tokenizer = CLIPTokenizer.from_pretrained(model_path, subfolder="tokenizer")  # tokenizers hold no weights, so no dtype is needed
+scheduler = Scheduler.from_pretrained(model_path, subfolder="scheduler")  # schedulers hold no weights either
+pipe = Pipeline(vq_model, tokenizer=tokenizer, text_encoder=text_encoder, transformer=model, scheduler=scheduler)
+pipe.to(device)
+
+MAX_SEED = 2**32 - 1
+MAX_IMAGE_SIZE = 512
+
+@spaces.GPU
+def generate_image(prompt, negative_prompt, seed, randomize_seed, width, height, guidance_scale, num_inference_steps, progress=gr.Progress(track_tqdm=True)):
+    if randomize_seed or seed == 0:
+        seed = torch.randint(0, MAX_SEED, (1,)).item()
+    torch.manual_seed(seed)
+
+    image = pipe(
+        prompt=prompt,
+        negative_prompt=negative_prompt,
+        height=height,
+        width=width,
+        guidance_scale=guidance_scale,
+        num_inference_steps=num_inference_steps
+    ).images[0]
+
+    return image, seed
+
+# Default negative prompt
+default_negative_prompt = "worst quality, low quality, low res, blurry, distortion, watermark, logo, signature, text, jpeg artifacts, signature, sketch, duplicate, ugly, identifying mark"
+css = """
+#col-container {
+ margin: 0 auto;
+ max-width: 640px;
+}
+"""
+
+examples = [
+ "Modern Architecture render with pleasing aesthetics.",
+ "An image of a Pikachu wearing a birthday hat and playing guitar.",
+ "A statue of a lion stands in front of a building.",
+ "A white and blue coffee mug with a picture of a man on it.",
+ "A metal sculpture of a deer with antlers.",
+ "A bronze statue of an owl with its wings spread.",
+ "A white table with a vase of flowers and a cup of coffee on top of it.",
+ "A woman stands on a dock in the fog.",
+ "A lion's head is shown in a grayscale image.",
+ "A sculpture of a Greek woman head with a headband and a head of hair."
+]
+
+with gr.Blocks(css=css) as demo:
+    with gr.Column(elem_id="col-container"):
+        gr.Markdown("# Monetico Text-to-Image Generator")
+        with gr.Row():
+            prompt = gr.Text(
+                label="Prompt",
+                show_label=False,
+                max_lines=1,
+                placeholder="Enter your prompt",
+                container=False,
+            )
+            run_button = gr.Button("Run", scale=0, variant="primary")
+        result = gr.Image(label="Result", show_label=False)
+        with gr.Accordion("Advanced Settings", open=False):
+            negative_prompt = gr.Text(
+                label="Negative prompt",
+                max_lines=1,
+                placeholder="Enter a negative prompt",
+                value=default_negative_prompt,
+            )
+            seed = gr.Slider(
+                label="Seed",
+                minimum=0,
+                maximum=MAX_SEED,
+                step=1,
+                value=0,
+            )
+            randomize_seed = gr.Checkbox(label="Randomize seed", value=True)
+            with gr.Row():
+                width = gr.Slider(
+                    label="Width",
+                    minimum=256,
+                    maximum=MAX_IMAGE_SIZE,
+                    step=32,
+                    value=512,
+                )
+                height = gr.Slider(
+                    label="Height",
+                    minimum=256,
+                    maximum=MAX_IMAGE_SIZE,
+                    step=32,
+                    value=512,
+                )
+            with gr.Row():
+                guidance_scale = gr.Slider(
+                    label="Guidance scale",
+                    minimum=0.0,
+                    maximum=20.0,
+                    step=0.1,
+                    value=9.0,
+                )
+                num_inference_steps = gr.Slider(
+                    label="Number of inference steps",
+                    minimum=1,
+                    maximum=100,
+                    step=1,
+                    value=48,
+                )
+        gr.Examples(examples=examples, inputs=[prompt])
+    gr.on(
+        triggers=[run_button.click, prompt.submit],
+        fn=generate_image,
+        inputs=[
+            prompt,
+            negative_prompt,
+            seed,
+            randomize_seed,
+            width,
+            height,
+            guidance_scale,
+            num_inference_steps,
+        ],
+        outputs=[result, seed],
+    )
+
+demo.launch()
\ No newline at end of file
diff --git a/Meissonic/app_fp8.py b/Meissonic/app_fp8.py
new file mode 100644
index 0000000000000000000000000000000000000000..b9f06fa08e1563d531670144742ad5da1ceccc1f
--- /dev/null
+++ b/Meissonic/app_fp8.py
@@ -0,0 +1,223 @@
+import os
+import sys
+sys.path.append("./")
+
+import torch
+from src.transformer import Transformer2DModel
+from src.pipeline import Pipeline
+from src.scheduler import Scheduler
+from transformers import (
+ CLIPTextModelWithProjection,
+ CLIPTokenizer,
+)
+from diffusers import VQModel
+import gradio as gr
+import time
+from torchao.quantization.quant_api import (
+ quantize_,
+ float8_weight_only,
+)
+
+device = 'cuda'
+
+def get_quantization_method(method):
+    quantization_methods = {
+        'fp8': lambda: float8_weight_only(),
+        'none': None
+    }
+    return quantization_methods.get(method, None)
+
+def load_models(quantization_method='none'):
+    model_path = "MeissonFlow/Meissonic"
+    dtype = torch.float16
+    model = Transformer2DModel.from_pretrained(model_path, subfolder="transformer", torch_dtype=dtype)
+    vq_model = VQModel.from_pretrained(model_path, subfolder="vqvae", torch_dtype=dtype)
+    text_encoder = CLIPTextModelWithProjection.from_pretrained(
+        "laion/CLIP-ViT-H-14-laion2B-s32B-b79K",
+        torch_dtype=dtype
+    )
+    tokenizer = CLIPTokenizer.from_pretrained(model_path, subfolder="tokenizer")
+    scheduler = Scheduler.from_pretrained(model_path, subfolder="scheduler")
+
+    if quantization_method != 'none':
+        quant_method = get_quantization_method(quantization_method)
+        if quant_method:
+            quantize_(model, quant_method())
+
+    pipe = Pipeline(vq_model, tokenizer=tokenizer, text_encoder=text_encoder, transformer=model, scheduler=scheduler)
+    return pipe.to(device)
+
+# Global variable to store the pipeline
+global_pipe = None
+current_quantization = 'none'
+
+def initialize_pipeline(quantization):
+    global global_pipe, current_quantization
+    if global_pipe is None or current_quantization != quantization:
+        global_pipe = load_models(quantization)
+        current_quantization = quantization
+    return global_pipe
+
+def generate_images(prompt, negative_prompt, seed, randomize_seed, width, height,
+                    guidance_scale, num_inference_steps, quantization_method, batch_size=1,
+                    progress=gr.Progress(track_tqdm=True)):
+    if randomize_seed or seed == 0:
+        seed = torch.randint(0, MAX_SEED, (1,)).item()
+    torch.manual_seed(seed)
+
+    # Initialize or update pipeline if needed
+    pipe = initialize_pipeline(quantization_method)
+
+    # Reset CUDA memory stats
+    torch.cuda.reset_peak_memory_stats()
+    start_time = time.time()
+
+    # Handle batch generation
+    if isinstance(prompt, str):
+        prompts = [prompt] * batch_size
+    else:
+        prompts = prompt[:batch_size]
+
+    images = pipe(
+        prompt=prompts,
+        negative_prompt=[negative_prompt] * batch_size,
+        height=height,
+        width=width,
+        guidance_scale=guidance_scale,
+        num_inference_steps=num_inference_steps
+    ).images
+
+    # Calculate performance metrics
+    inference_time = time.time() - start_time
+    memory_used = torch.cuda.max_memory_reserved() / (1024 ** 3)  # Convert to GB
+
+    performance_info = f"""
+    Inference Time: {inference_time:.2f} seconds
+    Memory Used: {memory_used:.2f} GB
+    Quantization: {quantization_method}
+    """
+
+    return (images[0] if batch_size == 1 else images), seed, performance_info
+
+MAX_SEED = 2**32 - 1
+MAX_IMAGE_SIZE = 1024
+default_negative_prompt = "worst quality, low quality, low res, blurry, distortion, watermark, logo, signature, text, jpeg artifacts, signature, sketch, duplicate, ugly, identifying mark"
+
+examples = [
+    "Two actors are posing for a picture with one wearing a black and white face paint.",
+ "A large body of water with a rock in the middle and mountains in the background.",
+ "A white and blue coffee mug with a picture of a man on it.",
+ "The sun is setting over a city skyline with a river in the foreground.",
+ "A black and white cat with blue eyes.",
+ "Three boats in the ocean with a rainbow in the sky.",
+ "A robot playing the piano.",
+ "A cat wearing a hat.",
+ "A dog in a jungle."
+]
+
+css = """
+#col-container {
+ margin: 0 auto;
+ max-width: 640px;
+}
+"""
+
+with gr.Blocks(css=css) as demo:
+    with gr.Column(elem_id="col-container"):
+        gr.Markdown("# Meissonic Text-to-Image Generator (with FP8 Support)")
+
+        with gr.Row():
+            prompt = gr.Text(
+                label="Prompt",
+                show_label=False,
+                max_lines=1,
+                placeholder="Enter your prompt",
+                container=False,
+            )
+            run_button = gr.Button("Run", scale=0, variant="primary")
+
+        result = gr.Image(label="Result", show_label=False)
+        performance_info = gr.Textbox(label="Performance Metrics", lines=4)
+
+        with gr.Accordion("Advanced Settings", open=False):
+            quantization = gr.Radio(
+                choices=['none', 'fp8'],
+                value='none',
+                label="Quantization Method",
+            )
+            negative_prompt = gr.Text(
+                label="Negative prompt",
+                max_lines=1,
+                placeholder="Enter a negative prompt",
+                value=default_negative_prompt,
+            )
+            seed = gr.Slider(
+                label="Seed",
+                minimum=0,
+                maximum=MAX_SEED,
+                step=1,
+                value=0,
+            )
+            randomize_seed = gr.Checkbox(label="Randomize seed", value=True)
+
+            with gr.Row():
+                width = gr.Slider(
+                    label="Width",
+                    minimum=256,
+                    maximum=MAX_IMAGE_SIZE,
+                    step=32,
+                    value=1024,
+                )
+                height = gr.Slider(
+                    label="Height",
+                    minimum=256,
+                    maximum=MAX_IMAGE_SIZE,
+                    step=32,
+                    value=1024,
+                )
+
+            with gr.Row():
+                guidance_scale = gr.Slider(
+                    label="Guidance scale",
+                    minimum=0.0,
+                    maximum=20.0,
+                    step=0.1,
+                    value=9.0,
+                )
+                num_inference_steps = gr.Slider(
+                    label="Number of inference steps",
+                    minimum=1,
+                    maximum=100,
+                    step=1,
+                    value=64,
+                )
+
+            batch_size = gr.Slider(
+                label="Batch Size",
+                minimum=1,
+                maximum=8,
+                step=1,
+                value=1,
+            )
+
+        gr.Examples(examples=examples, inputs=[prompt])
+
+    gr.on(
+        triggers=[run_button.click, prompt.submit],
+        fn=generate_images,
+        inputs=[
+            prompt,
+            negative_prompt,
+            seed,
+            randomize_seed,
+            width,
+            height,
+            guidance_scale,
+            num_inference_steps,
+            quantization,
+            batch_size,
+        ],
+        outputs=[result, seed, performance_info],
+    )
+
+demo.launch()
diff --git a/Meissonic/assets/architecture.png b/Meissonic/assets/architecture.png
new file mode 100644
index 0000000000000000000000000000000000000000..5cc0865dedd936ed3a3453637ce1a8b30608ba82
--- /dev/null
+++ b/Meissonic/assets/architecture.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:174e021d396802a14e454914586ca45d19a31581bd8e6e98c0252eb1c8f4b1c3
+size 327943
diff --git a/Meissonic/assets/demos.pdf b/Meissonic/assets/demos.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..794dbe526fc6cfc843a77c9fb581119c1f602429
--- /dev/null
+++ b/Meissonic/assets/demos.pdf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1d14191e0b8e9fdf4cb3a7199cf36554e60e456cdeba11509d305a8201e6b131
+size 2476203
diff --git a/Meissonic/assets/demos.png b/Meissonic/assets/demos.png
new file mode 100644
index 0000000000000000000000000000000000000000..28073db0290f0e1a8c40c88c0249773a11fcde92
--- /dev/null
+++ b/Meissonic/assets/demos.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:79322f0c5ba7093d2d5e2274f6d52e257d0063573eab49b19243aefbed63dd5e
+size 1828570
diff --git a/Meissonic/assets/inpaint/0eKR4M2uuL8.jpg b/Meissonic/assets/inpaint/0eKR4M2uuL8.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..4c6a908dc01284ed3e3ce1a785762f3b8144fae4
--- /dev/null
+++ b/Meissonic/assets/inpaint/0eKR4M2uuL8.jpg
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2ee6f2d8fae720821257db75cc819919a780f644f7a7a3e83aab0bdaccb13d53
+size 1088383
diff --git a/Meissonic/assets/inpaint/0eKR4M2uuL8.png b/Meissonic/assets/inpaint/0eKR4M2uuL8.png
new file mode 100644
index 0000000000000000000000000000000000000000..0a44ef73244a5c7f3bdb69a1b42eee6caa922431
Binary files /dev/null and b/Meissonic/assets/inpaint/0eKR4M2uuL8.png differ
diff --git a/Meissonic/assets/inpaint/_Rh_zxIUWXA.jpg b/Meissonic/assets/inpaint/_Rh_zxIUWXA.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..2a556e92e7deeb04335b7b43b0f5af2f44c30c7c
Binary files /dev/null and b/Meissonic/assets/inpaint/_Rh_zxIUWXA.jpg differ
diff --git a/Meissonic/assets/inpaint/_Rh_zxIUWXA.png b/Meissonic/assets/inpaint/_Rh_zxIUWXA.png
new file mode 100644
index 0000000000000000000000000000000000000000..32011f9ec1c7f06020a82263a751a4faf76cac02
Binary files /dev/null and b/Meissonic/assets/inpaint/_Rh_zxIUWXA.png differ
diff --git a/Meissonic/assets/inpaint/__Owak0IgJk.jpg b/Meissonic/assets/inpaint/__Owak0IgJk.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..1bd6703272a5ddb1c5614e7c2d440060611a927f
--- /dev/null
+++ b/Meissonic/assets/inpaint/__Owak0IgJk.jpg
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a08fa5cd031f119cb3123d767f2358b5081d6bc7dc1b017081f66bb824e86ae0
+size 109431
diff --git a/Meissonic/assets/inpaint/__Owak0IgJk.png b/Meissonic/assets/inpaint/__Owak0IgJk.png
new file mode 100644
index 0000000000000000000000000000000000000000..4a1185857570efc99b201e5947420faaee7a7a43
Binary files /dev/null and b/Meissonic/assets/inpaint/__Owak0IgJk.png differ
diff --git a/Meissonic/assets/inpaint/cases.json b/Meissonic/assets/inpaint/cases.json
new file mode 100644
index 0000000000000000000000000000000000000000..398136324425e59ae9eabc018e147512c31dcf93
--- /dev/null
+++ b/Meissonic/assets/inpaint/cases.json
@@ -0,0 +1,20 @@
+[
+ {
+ "input":"./assets/inpaint/_Rh_zxIUWXA.jpg",
+ "mask": "./assets/inpaint/_Rh_zxIUWXA.png",
+ "prompt": "A woman with short hair wears a silver gas mask.",
+ "negative_prompts": null
+ },
+ {
+ "input":"./assets/inpaint/0eKR4M2uuL8.jpg",
+ "mask": "./assets/inpaint/0eKR4M2uuL8.png",
+ "prompt": "A stylish dog wearing sunglasses.",
+ "negative_prompts": null
+ },
+ {
+ "input":"./assets/inpaint/__Owak0IgJk.jpg",
+ "mask": "./assets/inpaint/__Owak0IgJk.png",
+ "prompt": "A woman wearing a white suspender skirt is sitting",
+ "negative_prompts": null
+ }
+]
\ No newline at end of file
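
Each entry in `cases.json` pairs an input image with a mask, a prompt, and an optional negative prompt (`null` when unused). A small loader such as the following could validate that shape before running inference. `InpaintCase`, `load_cases`, and the fallback to an empty negative prompt are assumptions for illustration, not code from `inpaint.py`.

```python
import json
from dataclasses import dataclass

REQUIRED_KEYS = ("input", "mask", "prompt", "negative_prompts")


@dataclass
class InpaintCase:
    input: str
    mask: str
    prompt: str
    negative_prompt: str


def load_cases(text: str) -> list[InpaintCase]:
    """Parse the JSON text of a cases file into validated InpaintCase records."""
    cases = []
    for i, entry in enumerate(json.loads(text)):
        missing = [k for k in REQUIRED_KEYS if k not in entry]
        if missing:
            raise ValueError(f"case {i} is missing keys: {missing}")
        cases.append(InpaintCase(
            input=entry["input"],
            mask=entry["mask"],
            prompt=entry["prompt"],
            # null in the JSON means no negative prompt was given
            negative_prompt=entry["negative_prompts"] or "",
        ))
    return cases


example = """[
  {"input": "./assets/inpaint/0eKR4M2uuL8.jpg",
   "mask": "./assets/inpaint/0eKR4M2uuL8.png",
   "prompt": "A stylish dog wearing sunglasses.",
   "negative_prompts": null}
]"""
cases = load_cases(example)
print(cases[0].prompt)
```

Failing early on a missing key gives a clearer error than a `KeyError` raised halfway through a batch of generations.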
diff --git a/Meissonic/assets/outpaint/__G2yFuW7jQ.jpg b/Meissonic/assets/outpaint/__G2yFuW7jQ.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..58152afe80a86ff90ac12b443d05d49db4978369
--- /dev/null
+++ b/Meissonic/assets/outpaint/__G2yFuW7jQ.jpg
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5e98e81fa367ade7edd41cd60769d04ab6a7593dea3b3ec35f3f0e84634669de
+size 120864
diff --git a/Meissonic/assets/outpaint/__G2yFuW7jQ.png b/Meissonic/assets/outpaint/__G2yFuW7jQ.png
new file mode 100644
index 0000000000000000000000000000000000000000..df01f550cd0bdce62dba537055e2fe664297ec15
Binary files /dev/null and b/Meissonic/assets/outpaint/__G2yFuW7jQ.png differ
diff --git a/Meissonic/assets/outpaint/cases.json b/Meissonic/assets/outpaint/cases.json
new file mode 100644
index 0000000000000000000000000000000000000000..a50b7c7adcf635d3b6c35ef2cb7be43eab84cd46
--- /dev/null
+++ b/Meissonic/assets/outpaint/cases.json
@@ -0,0 +1,20 @@
+[
+ {
+ "input":"./assets/outpaint/__G2yFuW7jQ.jpg",
+ "mask": "./assets/outpaint/__G2yFuW7jQ.png",
+ "prompt": "fall mountains",
+        "negative_prompts": "The artwork avoids the pitfalls of bad art, such as ugly and deformed eyes and faces, poorly drawn, blurry, and disfigured bodies with extra limbs and close-ups that look weird. It also avoids other common issues such as watermarking, text errors, missing fingers or digits, cropping, poor quality, and JPEG artifacts. The artwork is free of signature or watermark and avoids framing issues. The hands are not deformed, the eyes are not disfigured, and there are no extra bodies or limbs. The artwork is not blurry, out of focus, or poorly drawn, and the proportions are not bad or deformed. There are no mutations, missing limbs, or floating or disconnected limbs. The hands and neck are not malformed, and there are no extra heads or out-of-frame elements. The artwork is not low-res or disgusting and is a well-drawn, highly detailed, and beautiful rendering."
+ },
+ {
+ "input":"./assets/outpaint/__G2yFuW7jQ.jpg",
+ "mask": "./assets/outpaint/__G2yFuW7jQ.png",
+ "prompt": "Rocket launch site",
+        "negative_prompts": "The artwork avoids the pitfalls of bad art, such as ugly and deformed eyes and faces, poorly drawn, blurry, and disfigured bodies with extra limbs and close-ups that look weird. It also avoids other common issues such as watermarking, text errors, missing fingers or digits, cropping, poor quality, and JPEG artifacts. The artwork is free of signature or watermark and avoids framing issues. The hands are not deformed, the eyes are not disfigured, and there are no extra bodies or limbs. The artwork is not blurry, out of focus, or poorly drawn, and the proportions are not bad or deformed. There are no mutations, missing limbs, or floating or disconnected limbs. The hands and neck are not malformed, and there are no extra heads or out-of-frame elements. The artwork is not low-res or disgusting and is a well-drawn, highly detailed, and beautiful rendering."
+ },
+ {
+ "input":"./assets/outpaint/__G2yFuW7jQ.jpg",
+ "mask": "./assets/outpaint/__G2yFuW7jQ.png",
+ "prompt": "Volcano",
+        "negative_prompts": "The artwork avoids the pitfalls of bad art, such as ugly and deformed eyes and faces, poorly drawn, blurry, and disfigured bodies with extra limbs and close-ups that look weird. It also avoids other common issues such as watermarking, text errors, missing fingers or digits, cropping, poor quality, and JPEG artifacts. The artwork is free of signature or watermark and avoids framing issues. The hands are not deformed, the eyes are not disfigured, and there are no extra bodies or limbs. The artwork is not blurry, out of focus, or poorly drawn, and the proportions are not bad or deformed. There are no mutations, missing limbs, or floating or disconnected limbs. The hands and neck are not malformed, and there are no extra heads or out-of-frame elements. The artwork is not low-res or disgusting and is a well-drawn, highly detailed, and beautiful rendering."
+ }
+]
\ No newline at end of file
diff --git a/Meissonic/cog.yaml b/Meissonic/cog.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..62a932f5e559b60da72737c697eaafdd5521d7cd
--- /dev/null
+++ b/Meissonic/cog.yaml
@@ -0,0 +1,29 @@
+# Configuration for Cog ⚙️
+# Reference: https://cog.run/yaml
+
+build:
+ # set to true if your model requires a GPU
+ gpu: true
+
+ # a list of ubuntu apt packages to install
+ system_packages:
+ - "libgl1-mesa-glx"
+ - "libglib2.0-0"
+
+ # python version in the form '3.11' or '3.11.4'
+ python_version: "3.11"
+
+  # a list of packages in the format <package-name>==<version>
+ python_packages:
+ - torch
+ - torchvision
+ - git+https://github.com/huggingface/diffusers.git
+ - accelerate
+ - transformers
+
+ # commands run after the environment is setup
+ run:
+ - curl -o /usr/local/bin/pget -L "https://github.com/replicate/pget/releases/download/v0.8.2/pget_linux_x86_64" && chmod +x /usr/local/bin/pget
+
+# predict.py defines how predictions are run on your model
+predict: "predict.py:Predictor"
diff --git a/Meissonic/cosmos_test_output/comparison_grid_video_0.png b/Meissonic/cosmos_test_output/comparison_grid_video_0.png
new file mode 100644
index 0000000000000000000000000000000000000000..b4a67c150ac34219cba77f43b57b30d25c6a26b0
--- /dev/null
+++ b/Meissonic/cosmos_test_output/comparison_grid_video_0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:73ee850b107ae6842e8c8a528689bfcb930ad7a5d863a9c7f1e2245f82650fa2
+size 394999
diff --git a/Meissonic/cosmos_test_output/comparison_grid_video_1.png b/Meissonic/cosmos_test_output/comparison_grid_video_1.png
new file mode 100644
index 0000000000000000000000000000000000000000..363ded074c9441f39fba859f3bd51cccbfecc6da
--- /dev/null
+++ b/Meissonic/cosmos_test_output/comparison_grid_video_1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:80853c45d1431ab2dee94b554c6d9e64f2d17ff1257aa72f560cf3040ea39f27
+size 8286293
diff --git a/Meissonic/cosmos_test_output/comparison_grid_video_2.png b/Meissonic/cosmos_test_output/comparison_grid_video_2.png
new file mode 100644
index 0000000000000000000000000000000000000000..9ccf9964a5b144c5de2f9e736062e302be36526c
--- /dev/null
+++ b/Meissonic/cosmos_test_output/comparison_grid_video_2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f8a285e5661e7100963aada2ca27ead641867d989f6ffaa37f7c1f5d873adfbd
+size 9085956
diff --git a/Meissonic/cosmos_test_output/comparison_grid_video_3.png b/Meissonic/cosmos_test_output/comparison_grid_video_3.png
new file mode 100644
index 0000000000000000000000000000000000000000..f6c8809b638f2a84a7f05be19c136653958f8fb6
--- /dev/null
+++ b/Meissonic/cosmos_test_output/comparison_grid_video_3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e56e9d5e0cbfbf740d1c53c2d105616e3dc7781e1c8945d712b5edc583ad715f
+size 8634545
diff --git a/Meissonic/cosmos_test_output/metrics_video_0.txt b/Meissonic/cosmos_test_output/metrics_video_0.txt
new file mode 100644
index 0000000000000000000000000000000000000000..671a60175ef523c994215a0935b89b04eba172b5
--- /dev/null
+++ b/Meissonic/cosmos_test_output/metrics_video_0.txt
@@ -0,0 +1,12 @@
+Video Index: 0
+Video Path: 000/000/000/0.mp4
+Caption: In the video, a man is seen in a living room setting, standing in front of a window with blinds. He is wearing a black sweater and appears to be in the middle of a conversation. The room is dimly lit, with a lamp providing a soft glow in the background. The man's expression is serious, suggesting that the conversation is of importance. The overall style of the video is realistic and naturalistic, capturing a candid moment in the man's life.
+
+=== Metrics ===
+Average PSNR: 27.54 dB
+Average MSE: 0.001764
+Average SSIM: 0.9779
+
+Per-frame PSNR: [26.747089385986328, 27.265975952148438, 27.32347297668457, 27.352922439575195, 27.334339141845703, 27.782726287841797, 27.661243438720703, 27.803525924682617, 27.705425262451172, 27.679603576660156, 27.297304153442383, 27.51146125793457, 27.4649658203125, 27.89719581604004, 27.753822326660156, 27.86109161376953, 27.80060577392578]
+Per-frame MSE: [0.002114907605573535, 0.0018767336150631309, 0.0018520501907914877, 0.001839534263126552, 0.001847422681748867, 0.001666200696490705, 0.0017134671797975898, 0.0016582406824454665, 0.0016961240908131003, 0.0017062382539734244, 0.0018632437568157911, 0.001773593365214765, 0.001792682334780693, 0.0016228572931140661, 0.0016773275565356016, 0.0016364054754376411, 0.0016593559412285686]
+Per-frame SSIM: [0.9738484025001526, 0.9765720963478088, 0.9768512845039368, 0.9769563674926758, 0.9766926765441895, 0.9789631962776184, 0.9783727526664734, 0.9790931344032288, 0.9786423444747925, 0.9784184694290161, 0.9765286445617676, 0.9777430295944214, 0.9775411486625671, 0.9799171090126038, 0.9794126749038696, 0.9798620343208313, 0.9795750379562378]
diff --git a/Meissonic/cosmos_test_output/metrics_video_1.txt b/Meissonic/cosmos_test_output/metrics_video_1.txt
new file mode 100644
index 0000000000000000000000000000000000000000..4222a4c6b10cff29079574967ed3908eebca46ee
--- /dev/null
+++ b/Meissonic/cosmos_test_output/metrics_video_1.txt
@@ -0,0 +1,12 @@
+Video Index: 1
+Video Path: 000/000/001/1.mp4
+Caption: The video shows a man standing next to a purple van with a floral design on the side. The man is wearing a black t-shirt and jeans, and he is smiling and waving his hands in the air. The van has pink rims and a black roof rack. The van is parked in front of a building with a glass door. The man appears to be happy and excited about the van. The video is likely a short clip of a man showing off his van.
+
+=== Metrics ===
+Average PSNR: 25.14 dB
+Average MSE: 0.003232
+Average SSIM: 0.9700
+
+Per-frame PSNR: [29.570905685424805, 25.845619201660156, 24.151002883911133, 24.53882598876953, 26.607555389404297, 23.609159469604492, 23.5848445892334, 24.532224655151367, 26.290340423583984, 23.606443405151367, 23.633737564086914, 24.562894821166992, 26.255611419677734, 24.259323120117188, 25.643463134765625, 25.491649627685547]
+Per-frame MSE: [0.0011038482189178467, 0.002602784661576152, 0.003845029277727008, 0.0035165559966117144, 0.0021839593537151814, 0.004355963785201311, 0.004380417056381702, 0.003521904582157731, 0.0023494488559663296, 0.004358689300715923, 0.004331381060183048, 0.0034971192944794893, 0.0023683111649006605, 0.0037503137718886137, 0.0027268033009022474, 0.002823807764798403]
+Per-frame SSIM: [0.9893906712532043, 0.974087119102478, 0.9622812867164612, 0.9658014178276062, 0.9791845083236694, 0.957465648651123, 0.957618772983551, 0.9664595127105713, 0.977811872959137, 0.9590921998023987, 0.960818350315094, 0.9692971110343933, 0.9799112677574158, 0.9681466817855835, 0.9765862822532654, 0.9759609699249268]
diff --git a/Meissonic/cosmos_test_output/metrics_video_2.txt b/Meissonic/cosmos_test_output/metrics_video_2.txt
new file mode 100644
index 0000000000000000000000000000000000000000..ecafb94fe13e5e9bd2a5a3c5cd57c9512b6d533f
--- /dev/null
+++ b/Meissonic/cosmos_test_output/metrics_video_2.txt
@@ -0,0 +1,12 @@
+Video Index: 2
+Video Path: 000/000/002/2.mp4
+Caption: The video is a news segment featuring a man in a red baseball cap and a blue vest, standing in front of a statue of a soldier and two children. The man appears to be a veteran, as indicated by the cap and the context of the event. The event is an honorary ceremony for lost submarines and submarine veterans, taking place near the World Peace Bell in Newport. The news segment is titled "Connected to the Community" and is scheduled to air at 11:10 PM on ABC 9. The style of the video is informative and respectful, focusing on the man and the event, with a clear and concise presentation of the details.
+
+=== Metrics ===
+Average PSNR: 22.09 dB
+Average MSE: 0.006399
+Average SSIM: 0.9607
+
+Per-frame PSNR: [24.496965408325195, 22.367679595947266, 22.21709442138672, 22.679195404052734, 23.883594512939453, 22.220516204833984, 22.20623207092285, 21.4675350189209, 22.316797256469727, 19.425098419189453, 21.102333068847656, 21.321147918701172, 23.025981903076172, 21.053565979003906, 21.95743179321289, 21.684494018554688]
+Per-frame MSE: [0.0035506151616573334, 0.005797383841127157, 0.006001925095915794, 0.005396105814725161, 0.0040892185643315315, 0.00599720049649477, 0.006016954779624939, 0.007132581900805235, 0.005865707993507385, 0.011415375396609306, 0.007758304942399263, 0.007377093657851219, 0.00498197739943862, 0.007845907472074032, 0.006371723022311926, 0.0067850141786038876]
+Per-frame SSIM: [0.9770643711090088, 0.9623730778694153, 0.9605475068092346, 0.9618802070617676, 0.9713600277900696, 0.9565339088439941, 0.9568989872932434, 0.9560506939888, 0.9673117399215698, 0.9364117383956909, 0.9567262530326843, 0.9589394927024841, 0.9706904888153076, 0.9546973705291748, 0.9623250365257263, 0.9610125422477722]
diff --git a/Meissonic/cosmos_test_output/metrics_video_3.txt b/Meissonic/cosmos_test_output/metrics_video_3.txt
new file mode 100644
index 0000000000000000000000000000000000000000..541d12165559e95bc4d39145fcb59be4c5b948d3
--- /dev/null
+++ b/Meissonic/cosmos_test_output/metrics_video_3.txt
@@ -0,0 +1,12 @@
+Video Index: 3
+Video Path: 000/000/003/3.mp4
+Caption: The video features a man in a pink shirt and a black bucket hat, wearing glasses and a necklace. He is holding a spoon and making a playful face, as if he is about to eat something. The background shows a lush garden with trees and a wooden structure. The man's expression and the spoon suggest that he is about to taste something, possibly food. The overall style of the video is casual and fun, with a focus on the man's reaction to the food.
+
+=== Metrics ===
+Average PSNR: 26.22 dB
+Average MSE: 0.002459
+Average SSIM: 0.9856
+
+Per-frame PSNR: [27.509328842163086, 26.409242630004883, 25.4619140625, 25.407241821289062, 26.446935653686523, 23.73136329650879, 25.60137176513672, 26.993793487548828, 28.306987762451172, 25.729787826538086, 25.27326774597168, 26.266807556152344, 27.462078094482422, 25.950550079345703, 26.63888168334961, 26.327953338623047]
+Per-frame MSE: [0.001774463220499456, 0.0022859980817884207, 0.002843207446858287, 0.0028792270459234715, 0.0022662426345050335, 0.004235099535435438, 0.002753359731286764, 0.0019981153309345245, 0.0014767315005883574, 0.002673137467354536, 0.00296943006105721, 0.0023622140288352966, 0.0017938758246600628, 0.0025406500790268183, 0.0021682626102119684, 0.002329188399016857]
+Per-frame SSIM: [0.9894547462463379, 0.9864593744277954, 0.983065664768219, 0.9832437634468079, 0.9867878556251526, 0.9748696088790894, 0.9840085506439209, 0.9884393215179443, 0.991378664970398, 0.9843630194664001, 0.9830930829048157, 0.986225962638855, 0.989523708820343, 0.9850807189941406, 0.9871661067008972, 0.9861734509468079]
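Note on the metrics files above: the per-frame PSNR and MSE values are consistent with the standard relation PSNR = 10 * log10(MAX^2 / MSE), with pixel values normalized to [0, 1]. That normalization is an assumption inferred from the numbers, not something stated in the files. The reported "Average PSNR" appears to be the mean of the per-frame PSNR values rather than the PSNR of the averaged MSE. A minimal sketch for re-checking the numbers follows; the helper names `psnr_from_mse` and `mean_psnr` are hypothetical and not part of this repository.

```python
import math


def psnr_from_mse(mse: float, max_val: float = 1.0) -> float:
    """PSNR in dB for a given mean squared error.

    max_val is the assumed peak pixel value (1.0 for frames scaled to [0, 1]).
    """
    if mse == 0.0:
        return float("inf")  # identical frames have unbounded PSNR
    return 10.0 * math.log10(max_val ** 2 / mse)


def mean_psnr(per_frame_mse):
    """Average of per-frame PSNR values (not the PSNR of the averaged MSE)."""
    values = [psnr_from_mse(m) for m in per_frame_mse]
    return sum(values) / len(values)


# First two per-frame MSE values copied from metrics_video_1.txt.
frame_mse = [0.0011038482189178467, 0.002602784661576152]
print([round(psnr_from_mse(m), 2) for m in frame_mse])
```

Because the logarithm is concave, the mean of per-frame PSNR is never lower than the PSNR of the mean MSE, so the two averaging conventions should not be mixed when comparing runs.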
diff --git a/Meissonic/inference.py b/Meissonic/inference.py
new file mode 100644
index 0000000000000000000000000000000000000000..e67cdb72c941f77ff27675cac53ff2f53a0cf03b
--- /dev/null
+++ b/Meissonic/inference.py
@@ -0,0 +1,65 @@
+import os
+import sys
+sys.path.append("./")
+
+import torch
+from torchvision import transforms
+from src.transformer import Transformer2DModel
+from src.pipeline import Pipeline
+from src.scheduler import Scheduler
+from transformers import (
+ CLIPTextModelWithProjection,
+ CLIPTokenizer,
+)
+from diffusers import VQModel
+
+device = 'cuda'
+
+model_path = "MeissonFlow/Meissonic"
+model = Transformer2DModel.from_pretrained(model_path,subfolder="transformer",)
+vq_model = VQModel.from_pretrained(model_path, subfolder="vqvae", )
+# text_encoder = CLIPTextModelWithProjection.from_pretrained(model_path,subfolder="text_encoder",)
+text_encoder = CLIPTextModelWithProjection.from_pretrained( #using original text enc for stable sampling
+ "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"
+ )
+tokenizer = CLIPTokenizer.from_pretrained(model_path,subfolder="tokenizer",)
+scheduler = Scheduler.from_pretrained(model_path,subfolder="scheduler",)
+pipe=Pipeline(vq_model, tokenizer=tokenizer,text_encoder=text_encoder,transformer=model,scheduler=scheduler)
+
+pipe = pipe.to(device)
+
+steps = 64
+CFG = 9
+resolution = 1024
+negative_prompt = "worst quality, low quality, low res, blurry, distortion, watermark, logo, signature, text, jpeg artifacts, signature, sketch, duplicate, ugly, identifying mark"
+
+prompts = [
+ "Two actors are posing for a picture with one wearing a black and white face paint.",
+ "A large body of water with a rock in the middle and mountains in the background.",
+ "A white and blue coffee mug with a picture of a man on it.",
+ "A statue of a man with a crown on his head.",
+ "A man in a yellow wet suit is holding a big black dog in the water.",
+ "A white table with a vase of flowers and a cup of coffee on top of it.",
+ "A woman stands on a dock in the fog.",
+ "A woman is standing next to a picture of another woman."
+]
+
+batched_generation = False
+num_images = len(prompts) if batched_generation else 1
+
+images = pipe(
+ prompt=prompts[:num_images],
+ negative_prompt=[negative_prompt] * num_images,
+ height=resolution,
+ width=resolution,
+ guidance_scale=CFG,
+ num_inference_steps=steps
+ ).images
+
+output_dir = "./output"
+os.makedirs(output_dir, exist_ok=True)
+for i, prompt in enumerate(prompts[:num_images]):
+ sanitized_prompt = prompt.replace(" ", "_")
+ file_path = os.path.join(output_dir, f"{sanitized_prompt}_{resolution}_{steps}_{CFG}.png")
+ images[i].save(file_path)
+ print(f"The {i+1}/{num_images} image is saved to {file_path}")
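Note on the output naming in the script above: `prompt.replace(" ", "_")` keeps punctuation and the full prompt length in the file name, which may cause trouble with long prompts or characters such as `/` or `:`. A minimal sketch of a more conservative approach follows; `safe_filename` and its `max_len` default are hypothetical and not part of this repository.

```python
import re


def safe_filename(prompt: str, max_len: int = 80) -> str:
    """Turn a free-form prompt into a conservative file-name stem."""
    # Collapse every run of characters outside [A-Za-z0-9_-] into one underscore.
    stem = re.sub(r"[^A-Za-z0-9_-]+", "_", prompt).strip("_")
    # Truncate so long prompts stay within typical file-name length limits,
    # and fall back to a fixed stem if nothing usable remains.
    return stem[:max_len] or "prompt"


print(safe_filename("A woman stands on a dock in the fog."))
```

Truncation can make two long prompts collide on the same stem, so appending the loop index to the file name may be worth considering.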
diff --git a/Meissonic/inference_fp16.py b/Meissonic/inference_fp16.py
new file mode 100644
index 0000000000000000000000000000000000000000..3d7af22ff083b4606b95bd928e0a674ed83ace29
--- /dev/null
+++ b/Meissonic/inference_fp16.py
@@ -0,0 +1,64 @@
+import os
+import sys
+sys.path.append("./")
+
+import torch
+from torchvision import transforms
+from src.transformer import Transformer2DModel
+from src.pipeline import Pipeline
+from src.scheduler import Scheduler
+from transformers import (
+ CLIPTextModelWithProjection,
+ CLIPTokenizer,
+)
+from diffusers import VQModel
+
+device = 'cuda'
+dtype = torch.bfloat16
+model_path = "MeissonFlow/Meissonic"
+model = Transformer2DModel.from_pretrained(model_path, subfolder="transformer", torch_dtype=dtype)
+vq_model = VQModel.from_pretrained(model_path, subfolder="vqvae", torch_dtype=dtype)
+# text_encoder = CLIPTextModelWithProjection.from_pretrained(model_path,subfolder="text_encoder", torch_dtype=dtype)
+text_encoder = CLIPTextModelWithProjection.from_pretrained( #using original text enc for stable sampling
+ "laion/CLIP-ViT-H-14-laion2B-s32B-b79K",torch_dtype=dtype)
+tokenizer = CLIPTokenizer.from_pretrained(model_path, subfolder="tokenizer", torch_dtype=dtype)
+scheduler = Scheduler.from_pretrained(model_path, subfolder="scheduler")
+pipe=Pipeline(vq_model, tokenizer=tokenizer,text_encoder=text_encoder,transformer=model,scheduler=scheduler)
+
+pipe = pipe.to(device)
+
+steps = 64
+CFG = 9
+resolution = 1024
+negative_prompt = "worst quality, low quality, low res, blurry, distortion, watermark, logo, signature, text, jpeg artifacts, signature, sketch, duplicate, ugly, identifying mark"
+
+prompts = [
+ "Two actors are posing for a picture with one wearing a black and white face paint.",
+ "A large body of water with a rock in the middle and mountains in the background.",
+ "A white and blue coffee mug with a picture of a man on it.",
+ "A statue of a man with a crown on his head.",
+ "A man in a yellow wet suit is holding a big black dog in the water.",
+ "A white table with a vase of flowers and a cup of coffee on top of it.",
+ "A woman stands on a dock in the fog.",
+ "A woman is standing next to a picture of another woman."
+]
+
+batched_generation = False
+num_images = len(prompts) if batched_generation else 1
+
+images = pipe(
+ prompt=prompts[:num_images],
+ negative_prompt=[negative_prompt] * num_images,
+ height=resolution,
+ width=resolution,
+ guidance_scale=CFG,
+ num_inference_steps=steps
+ ).images
+
+output_dir = "./output"
+os.makedirs(output_dir, exist_ok=True)
+for i, prompt in enumerate(prompts[:num_images]):
+ sanitized_prompt = prompt.replace(" ", "_")
+ file_path = os.path.join(output_dir, f"{sanitized_prompt}_{resolution}_{steps}_{CFG}.png")
+ images[i].save(file_path)
+ print(f"The {i+1}/{num_images} image is saved to {file_path}")
diff --git a/Meissonic/inference_fp16_Monetico.py b/Meissonic/inference_fp16_Monetico.py
new file mode 100644
index 0000000000000000000000000000000000000000..0460597cc5a11813b218a95bff1201d88a2014e4
--- /dev/null
+++ b/Meissonic/inference_fp16_Monetico.py
@@ -0,0 +1,64 @@
+import os
+import sys
+sys.path.append("./")
+
+import torch
+from torchvision import transforms
+from src.transformer import Transformer2DModel
+from src.pipeline import Pipeline
+from src.scheduler import Scheduler
+from transformers import (
+ CLIPTextModelWithProjection,
+ CLIPTokenizer,
+)
+from diffusers import VQModel
+
+device = 'cuda'
+dtype = torch.bfloat16
+model_path = "Collov-Labs/Monetico"
+model = Transformer2DModel.from_pretrained(model_path, subfolder="transformer", torch_dtype=dtype)
+vq_model = VQModel.from_pretrained(model_path, subfolder="vqvae", torch_dtype=dtype)
+text_encoder = CLIPTextModelWithProjection.from_pretrained(model_path, subfolder="text_encoder", torch_dtype=dtype) # better for Monetico
+# text_encoder = CLIPTextModelWithProjection.from_pretrained( #more stable sampling for some cases
+# "laion/CLIP-ViT-H-14-laion2B-s32B-b79K", torch_dtype=dtype
+# )
+tokenizer = CLIPTokenizer.from_pretrained(model_path, subfolder="tokenizer", torch_dtype=dtype)
+scheduler = Scheduler.from_pretrained(model_path, subfolder="scheduler", torch_dtype=dtype)
+pipe = Pipeline(vq_model, tokenizer=tokenizer, text_encoder=text_encoder, transformer=model, scheduler=scheduler)
+pipe.to(device)
+
+steps = 48
+CFG = 9
+resolution = 512
+negative_prompt = "worst quality, low quality, low res, blurry, distortion, watermark, logo, signature, text, jpeg artifacts, signature, sketch, duplicate, ugly, identifying mark"
+
+prompts = [
+ "Two actors are posing for a picture with one wearing a black and white face paint.",
+ "A large body of water with a rock in the middle and mountains in the background.",
+ "A white and blue coffee mug with a picture of a man on it.",
+ "A statue of a man with a crown on his head.",
+ "A man in a yellow wet suit is holding a big black dog in the water.",
+ "A white table with a vase of flowers and a cup of coffee on top of it.",
+ "A woman stands on a dock in the fog.",
+ "A woman is standing next to a picture of another woman."
+]
+
+batched_generation = False
+num_images = len(prompts) if batched_generation else 1
+
+images = pipe(
+ prompt=prompts[:num_images],
+ negative_prompt=[negative_prompt] * num_images,
+ height=resolution,
+ width=resolution,
+ guidance_scale=CFG,
+ num_inference_steps=steps
+ ).images
+
+output_dir = "./output"
+os.makedirs(output_dir, exist_ok=True)
+for i, prompt in enumerate(prompts[:num_images]):
+ sanitized_prompt = prompt.replace(" ", "_")
+ file_path = os.path.join(output_dir, f"{sanitized_prompt}_{resolution}_{steps}_{CFG}.png")
+ images[i].save(file_path)
+ print(f"The {i+1}/{num_images} image is saved to {file_path}")
diff --git a/Meissonic/inference_fp8.py b/Meissonic/inference_fp8.py
new file mode 100644
index 0000000000000000000000000000000000000000..2804af3a06a02400fb69366d414bf419ade183b6
--- /dev/null
+++ b/Meissonic/inference_fp8.py
@@ -0,0 +1,103 @@
+import os
+import sys
+sys.path.append("./")
+
+import torch
+from src.transformer import Transformer2DModel
+from src.pipeline import Pipeline
+from src.scheduler import Scheduler
+from transformers import (
+ CLIPTextModelWithProjection,
+ CLIPTokenizer,
+)
+from diffusers import VQModel
+import time
+import argparse
+
+from torchao.quantization.quant_api import (
+ quantize_,
+ float8_weight_only, # weight-only FP8; activations stay in the model dtype
+)
+
+device = 'cuda'
+
+def get_quantization_method(method):
+ quantization_methods = {
+ 'fp8': lambda: float8_weight_only(),
+ }
+ return quantization_methods.get(method, None)
+
+def load_models(quantization_method=None):
+ model_path = "MeissonFlow/Meissonic"
+ dtype = torch.float16
+ model = Transformer2DModel.from_pretrained(model_path, subfolder="transformer", torch_dtype=dtype)
+ vq_model = VQModel.from_pretrained(model_path, subfolder="vqvae", torch_dtype=dtype)
+ text_encoder = CLIPTextModelWithProjection.from_pretrained(
+ "laion/CLIP-ViT-H-14-laion2B-s32B-b79K",
+ torch_dtype=dtype
+ )
+ tokenizer = CLIPTokenizer.from_pretrained(model_path, subfolder="tokenizer")
+ scheduler = Scheduler.from_pretrained(model_path, subfolder="scheduler")
+
+ if quantization_method:
+ quant_method = get_quantization_method(quantization_method)
+ if quant_method:
+ quantize_(model, quant_method())
+ else:
+ print(f"Unsupported quantization method: {quantization_method}")
+
+
+ pipe = Pipeline(vq_model, tokenizer=tokenizer, text_encoder=text_encoder, transformer=model, scheduler=scheduler)
+ return pipe.to(device)
+
+def run_inference(pipe, prompt, negative_prompt, resolution, cfg, steps):
+ return pipe(prompt=prompt, negative_prompt=negative_prompt, height=resolution, width=resolution, guidance_scale=cfg, num_inference_steps=steps).images[0]
+
+def main(quantization_method):
+ steps = 64
+ CFG = 9
+ resolution = 1024
+ negative_prompts = "worst quality, low quality, low res, blurry, distortion, watermark, logo, signature, text, jpeg artifacts, signature, sketch, duplicate, ugly, identifying mark"
+
+ prompts = [
+ "Two actors are posing for a picture with one wearing a black and white face paint.",
+ "A large body of water with a rock in the middle and mountains in the background.",
+ "A white and blue coffee mug with a picture of a man on it.",
+ "The sun is setting over a city skyline with a river in the foreground.",
+ "A black and white cat with blue eyes.",
+ "Three boats in the ocean with a rainbow in the sky.",
+ "A robot playing the piano.",
+ "A cat wearing a hat.",
+ "A dog in a jungle.",
+ ]
+
+ output_dir = "./output"
+ os.makedirs(output_dir, exist_ok=True)
+
+ pipe = load_models(quantization_method)
+ start_time = time.time()
+ total_memory_used = 0
+ for i, prompt in enumerate(prompts):
+ torch.cuda.reset_peak_memory_stats()
+ image_start_time = time.time()
+ image = run_inference(pipe, prompt, negative_prompts, resolution, CFG, steps)
+ image_end_time = time.time()
+ image.save(os.path.join(output_dir, f"{prompt[:10]}_{resolution}_{steps}_{CFG}_{quantization_method}.png"))
+
+ memory_used = torch.cuda.max_memory_reserved() / (1024 ** 3) # Convert to GB
+ total_memory_used += memory_used
+
+ print(f"Image {i+1} time: {image_end_time - image_start_time:.2f} seconds")
+ print(f"Image {i+1} max memory used: {memory_used:.2f} GB")
+
+ total_time = time.time() - start_time
+ avg_memory_used = total_memory_used / len(prompts)
+ print(f"Total inference time ({quantization_method}): {total_time:.2f} seconds")
+ print(f"Average memory used per image: {avg_memory_used:.2f} GB")
+
+if __name__ == "__main__":
+ parser = argparse.ArgumentParser(description="Run inference with specified quantization method.")
+ parser.add_argument("--quantization", type=str, choices=['fp8'],
+ help="Quantization method to use")
+ args = parser.parse_args()
+ main(args.quantization)
diff --git a/Meissonic/inpaint.py b/Meissonic/inpaint.py
new file mode 100644
index 0000000000000000000000000000000000000000..4fa97623ac9d05d51645a3e61f6cb35a2f90327b
--- /dev/null
+++ b/Meissonic/inpaint.py
@@ -0,0 +1,55 @@
+import os
+import sys
+sys.path.append("./")
+
+import argparse
+import json
+from PIL import Image
+from src.transformer import Transformer2DModel
+from src.pipeline_inpaint import InpaintPipeline
+from src.scheduler import Scheduler
+from transformers import (
+ CLIPTextModelWithProjection,
+ CLIPTokenizer,
+)
+from diffusers import VQModel
+
+def get_parse_args():
+ parser = argparse.ArgumentParser(description="Meissonic Inpaint and Outpaint")
+ parser.add_argument("--mode", type=str,default="inpaint", choices=["inpaint", "outpaint"], help="Inpaint or Outpaint")
+ return parser.parse_args()
+
+if __name__ == "__main__":
+ args = get_parse_args()
+ device = 'cuda'
+
+ model_path = "MeissonFlow/Meissonic"
+ model = Transformer2DModel.from_pretrained(model_path, subfolder="transformer", )
+ vq_model = VQModel.from_pretrained(model_path, subfolder="vqvae", )
+ # text_encoder = CLIPTextModelWithProjection.from_pretrained(model_path,subfolder="text_encoder",)
+ text_encoder = CLIPTextModelWithProjection.from_pretrained( # using original text enc for stable sampling
+ "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"
+ )
+ tokenizer = CLIPTokenizer.from_pretrained(model_path, subfolder="tokenizer", )
+ scheduler = Scheduler.from_pretrained(model_path, subfolder="scheduler", )
+
+ pipe=InpaintPipeline(vq_model, tokenizer=tokenizer,text_encoder=text_encoder,transformer=model,scheduler=scheduler)
+ pipe = pipe.to(device)
+
+ with open(f"./assets/{args.mode}/cases.json", 'r', encoding='utf-8') as file:
+ cases = json.load(file)
+ item = cases[0]
+
+ steps = 64
+ CFG = 9
+ resolution = 1024
+ negative_prompts = item["negative_prompts"] if "negative_prompts" in item.keys() else "worst quality, low quality, low res, blurry, distortion, watermark, logo, signature, text, jpeg artifacts, signature, sketch, duplicate, ugly, identifying mark"
+
+ image = Image.open(item["input"]).resize((resolution, resolution)).convert("RGB")
+ mask = Image.open(item["mask"]).resize((resolution, resolution)).convert("RGB")
+
+ image = pipe(prompt=item["prompt"],negative_prompt=negative_prompts,image =image, mask_image =mask, guidance_scale=CFG, num_inference_steps=steps).images[0]
+
+ output_dir = "./output"
+ os.makedirs(output_dir, exist_ok=True)
+ image.save(os.path.join(output_dir, f"{item['prompt'][:10]}_{resolution}_{steps}_{CFG}.png"))
\ No newline at end of file
diff --git a/Meissonic/predict.py b/Meissonic/predict.py
new file mode 100644
index 0000000000000000000000000000000000000000..0d24702bbaa8ba3fd16f4588d377808e83f5e240
--- /dev/null
+++ b/Meissonic/predict.py
@@ -0,0 +1,105 @@
+# Prediction interface for Cog ⚙️
+# https://cog.run/python
+
+import os
+import subprocess
+import time
+import torch
+from transformers import (
+ CLIPTextModelWithProjection,
+ CLIPTokenizer,
+)
+from diffusers import VQModel
+from cog import BasePredictor, Input, Path
+
+from src.transformer import Transformer2DModel
+from src.pipeline import Pipeline
+from src.scheduler import Scheduler
+
+
+MODEL_CACHE = "model_cache"
+MODEL_URL = (
+ f"https://weights.replicate.delivery/default/viiika/Meissonic/{MODEL_CACHE}.tar"
+)
+
+os.environ.update(
+ {
+ "HF_DATASETS_OFFLINE": "1",
+ "TRANSFORMERS_OFFLINE": "1",
+ "HF_HOME": MODEL_CACHE,
+ "TORCH_HOME": MODEL_CACHE,
+ "HF_DATASETS_CACHE": MODEL_CACHE,
+ "TRANSFORMERS_CACHE": MODEL_CACHE,
+ "HUGGINGFACE_HUB_CACHE": MODEL_CACHE,
+ }
+)
+
+
+def download_weights(url, dest):
+ start = time.time()
+ print("downloading url: ", url)
+ print("downloading to: ", dest)
+ subprocess.check_call(["pget", "-x", url, dest], close_fds=False)
+ print("downloading took: ", time.time() - start)
+
+
+class Predictor(BasePredictor):
+ def setup(self) -> None:
+ """Load the model into memory to make running multiple predictions efficient"""
+
+ if not os.path.exists(MODEL_CACHE):
+ download_weights(MODEL_URL, MODEL_CACHE)
+
+ model_path = f"{MODEL_CACHE}/MeissonFlow/Meissonic"
+ model = Transformer2DModel.from_pretrained(model_path, subfolder="transformer")
+ vq_model = VQModel.from_pretrained(model_path, subfolder="vqvae")
+ text_encoder = CLIPTextModelWithProjection.from_pretrained( # more stable sampling for some cases
+ f"{MODEL_CACHE}/laion/CLIP-ViT-H-14-laion2B-s32B-b79K"
+ )
+ tokenizer = CLIPTokenizer.from_pretrained(model_path, subfolder="tokenizer")
+ scheduler = Scheduler.from_pretrained(model_path, subfolder="scheduler")
+ self.pipe = Pipeline(
+ vq_model,
+ tokenizer=tokenizer,
+ text_encoder=text_encoder,
+ transformer=model,
+ scheduler=scheduler,
+ ).to("cuda")
+
+ def predict(
+ self,
+ prompt: str = Input(
+ description="Input prompt",
+ default="a photo of an astronaut riding a horse on mars",
+ ),
+ negative_prompt: str = Input(
+ description="Specify things to not see in the output",
+ default="worst quality, low quality, low res, blurry, distortion, watermark, logo, signature, text, jpeg artifacts, signature, sketch, duplicate, ugly, identifying mark",
+ ),
+ num_inference_steps: int = Input(
+ description="Number of denoising steps", ge=1, le=100, default=64
+ ),
+ guidance_scale: float = Input(
+ description="Scale for classifier-free guidance", ge=0, le=20, default=9
+ ),
+ seed: int = Input(
+ description="Random seed. Leave blank to randomize the seed", default=None
+ ),
+ ) -> Path:
+ """Run a single prediction on the model"""
+ if seed is None:
+ seed = int.from_bytes(os.urandom(2), "big")
+ print(f"Using seed: {seed}")
+ torch.manual_seed(seed)
+
+ image = self.pipe(
+ prompt=prompt,
+ negative_prompt=negative_prompt,
+ height=1024,
+ width=1024,
+ guidance_scale=guidance_scale,
+ num_inference_steps=num_inference_steps,
+ ).images[0]
+ output_path = "/tmp/out.png"
+ image.save(output_path)
+ return Path(output_path)
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/.gitattributes b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/.gitattributes
new file mode 100644
index 0000000000000000000000000000000000000000..14a69d3c1c50c140b42093ed67c383e51f98c237
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/.gitattributes
@@ -0,0 +1,38 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+autoencoder.jit filter=lfs diff=lfs merge=lfs -text
+decoder.jit filter=lfs diff=lfs merge=lfs -text
+encoder.jit filter=lfs diff=lfs merge=lfs -text
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/README.md b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..d7338fbf966ac341b466f5d1d10c0e62a67421b6
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/README.md
@@ -0,0 +1,326 @@
+---
+license: other
+license_name: nvidia-open-model-license
+license_link: >-
+ https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf
+library_name: nemo
+---
+# **Cosmos Tokenizer**: A suite of image and video tokenizers
+
+[**Website**](https://research.nvidia.com/labs/dir/cosmos-tokenizer) | [**Code**](https://github.com/NVIDIA/Cosmos-Tokenizer) | [**Video**](https://youtu.be/Soy_myOfWIU)
+
+
+# Model Overview
+
+## Description:
+**Cosmos Tokenizer** is a suite of visual tokenizers for images and videos that delivers various compression rates while maintaining high reconstruction quality. Cosmos Tokenizer can serve as an effective and efficient building block in both diffusion-based and autoregressive models for image and video generation.
+
+
+Our tokenizers come in two types: **Continuous** (C) and **Discrete** (D), each with **Image** (I) and **Video** (V) variants:
+* Continuous tokenizers encode visual data into continuous latent embeddings, as shown in latent diffusion models like [Stable Diffusion](https://github.com/CompVis/stable-diffusion). These embeddings are suitable for models that generate data by sampling from continuous distributions.
+* Discrete tokenizers encode visual data into discrete latent codes, mapping them into quantized indices, as seen in autoregressive transformers such as [VideoPoet](https://sites.research.google/videopoet/). This discretization is required for models that generate data by optimizing the cross-entropy loss, such as the GPT models.
+
+
+| | Continuous ( C ) | Discrete ( D ) |
+| ------------------|---------------------|---------------------|
+| **Images ( I )** | Cosmos-Tokenizer-CI | Cosmos-Tokenizer-DI |
+| **Videos ( V )** | Cosmos-Tokenizer-CV | Cosmos-Tokenizer-DV |
+
+
+Given an image or a video, Cosmos Tokenizer outputs either continuous latents or discrete tokens. Cosmos Tokenizer achieves spatial compression rates of 8x8 or 16x16 and temporal compression factors of 4x or 8x, resulting in a total compression factor of up to 2048x (=8x16x16). Cosmos Tokenizer delivers 8x more total compression than state-of-the-art (SOTA) methods while simultaneously maintaining higher image quality and running up to 12x faster than the best available SOTA tokenizers.
+
+**Model Developer**: NVIDIA
+
+## Model Versions
+
+The initial release (v1.0) of Cosmos Tokenizer includes the following tokenizers:
+* **Continuous Tokenizers**
+ * Continuous Image (CI) Tokenizer
+ * [Cosmos-Tokenizer-CI8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-CI8x8) (8x8 spatial compression)
+ * [Cosmos-Tokenizer-CI16x16](https://huggingface.co/nvidia/Cosmos-Tokenizer-CI16x16) (16x16 spatial compression)
+ * Continuous Video (CV) Tokenizer
+ * [Cosmos-Tokenizer-CV4x8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-CV4x8x8) (4x temporal compression, 8x8 spatial compression)
+ * [Cosmos-Tokenizer-CV8x8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-CV8x8x8) (8x temporal compression, 8x8 spatial compression)
+ * [Cosmos-Tokenizer-CV8x16x16](https://huggingface.co/nvidia/Cosmos-Tokenizer-CV8x16x16) (8x temporal compression, 16x16 spatial compression)
+* **Discrete Tokenizers**
+ * Discrete Image (DI) Tokenizer
+ * [Cosmos-Tokenizer-DI8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-DI8x8) (8x8 spatial compression)
+ * [Cosmos-Tokenizer-DI16x16](https://huggingface.co/nvidia/Cosmos-Tokenizer-DI16x16) (16x16 spatial compression)
+ * Discrete Video (DV) Tokenizer
+ * [Cosmos-Tokenizer-DV4x8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-DV4x8x8) (4x temporal compression, 8x8 spatial compression)
+ * [Cosmos-Tokenizer-DV8x8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-DV8x8x8) (8x temporal compression, 8x8 spatial compression)
+ * [Cosmos-Tokenizer-DV8x16x16](https://huggingface.co/nvidia/Cosmos-Tokenizer-DV8x16x16) (8x temporal compression, 16x16 spatial compression)
+
+
+### License/Terms of Use:
+[NVIDIA Open Model License](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf)
+
+Under the NVIDIA Open Model License, NVIDIA confirms:
+
+* Models are commercially usable.
+* You are free to create and distribute Derivative Models.
+* NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models.
+
+## Model Architecture:
+
+We designed Cosmos Tokenizer with a lightweight, computationally efficient architecture featuring a temporally causal design. Specifically, we employ causal temporal convolution and causal temporal attention layers to preserve the natural temporal order of video frames, which allows a single unified network architecture to tokenize both images and videos. The encoder and decoder form a symmetric pair that mirror each other. The encoder starts with a 2-level [Haar wavelet](https://link.springer.com/book/10.1007/978-3-319-04295-4) transform layer, which down-samples inputs by a factor of 4 in both the spatial and temporal dimensions. Likewise, the decoder ends with an inverse wavelet transform. We employ the vanilla autoencoder (AE) formulation to model the latent space for continuous tokenizers. For discrete tokenizers, we adopt [Finite-Scalar-Quantization](https://openreview.net/forum?id=8ishA3LxN8) (FSQ) as the latent space quantizer.
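To illustrate why a 2-level Haar transform down-samples by a factor of 4, here is a minimal one-dimensional sketch in plain Python. It is an assumption-laden illustration, not the tokenizer's actual layer: the real layer operates on the spatial and temporal axes of a video tensor, and how it arranges the sub-bands is not described in this README. The function names are hypothetical.

```python
import math


def haar_step(signal):
    """One Haar level: pairwise scaled sums (approximation) and differences (detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def haar_two_level(signal):
    """Apply two Haar levels; the approximation band is 4x shorter than the input."""
    approx1, detail1 = haar_step(signal)
    approx2, detail2 = haar_step(approx1)
    return approx2, (detail1, detail2)


x = [float(v) for v in range(16)]
approx, details = haar_two_level(x)
print(len(x), "->", len(approx))  # 16 -> 4
```

Each level halves the length of the approximation band, so two levels give the factor of 4. The detail bands are kept, which is what lets an inverse transform recover the input.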
+
+
+
+
+
+## Input/Output Specifications
+
+### Encoder
+* **Input**
+ * **Types:** Images or Videos
+ * **Format:** RGB (Red, Green, Blue)
+ * **Resolution:**
+ * Minimum: 256px (shorter side)
+ * Maximum: Up to 4K
+ * **Video Length:** Up to 8 seconds for 1080p videos (bounded by A100 80G GPU memory; higher resolutions will have shorter supported durations)
+
+* **Output**
+ * **Types:** Tokens
+ * Continuous Image/Video Tokenizers: Continuous value feature vectors
+ * Discrete Image/Video Tokenizers: Integer indices
+
+### Decoder
+* **Input**
+ * **Types:** Tokens from encoder
+
+* **Output**
+ * **Types:** Images or Videos (matching input type)
+ * **Format:** RGB (Red, Green, Blue)
+ * **Resolution:** Same as input resolution
+ * **Video Length:** Same as input video length
+
+## Software Integration (Required For NVIDIA Models Only):
+**Runtime Engine(s):**
+* [Cosmos-Tokenizer](https://github.com/NVIDIA/Cosmos-Tokenizer)
+* [NeMo](https://github.com/NVIDIA/NeMo) (please install the latest version from the GitHub main branch)
+
+**Supported Hardware Microarchitecture Compatibility:**
+* NVIDIA Ampere (e.g., A100)
+* NVIDIA Hopper (e.g., H100)
+
+Note: We have only tested Cosmos Tokenizer with BF16 precision on Ampere and Hopper GPUs. If you are using older versions of NVIDIA GPUs (e.g., NVIDIA Volta GPUs), you may need to switch to FP32 precision.
+
+
+**Operating System(s):**
+* Linux (We have not tested on other operating systems.)
+
+# Usage
+Inference Engines:
+* [Cosmos-Tokenizer](https://github.com/NVIDIA/Cosmos-Tokenizer) (PyTorch)
+* [NeMo](https://github.com/NVIDIA/NeMo)
+
+## Inference with `Cosmos-Tokenizer` (PyTorch)
+### Step-1: Installation of `Cosmos-Tokenizer`
+Note: Currently, the `Cosmos-Tokenizer` code is only supported on Linux.
+
+- Please clone the `Cosmos-Tokenizer` from GitHub repo [github.com/NVIDIA/Cosmos-Tokenizer](https://github.com/NVIDIA/Cosmos-Tokenizer).
+
+ ```bash
+ git clone https://github.com/NVIDIA/Cosmos-Tokenizer.git
+ cd Cosmos-Tokenizer
+ ```
+- Install dependencies
+
+ ```bash
+ pip3 install -r requirements.txt
+ apt-get install -y ffmpeg
+ ```
+
+- Preferably, you could build a docker image using our provided Dockerfile.
+ ```bash
+ docker build -t cosmos-docker -f Dockerfile .
+ # You can run the container as:
+ docker run --gpus all -it --rm -v /home/${USER}:/home/${USER} \
+ --workdir ${PWD} cosmos-docker /bin/bash
+ ```
+
+### Step-2: Download Pre-trained Checkpoints
+- Create a local directory for the pre-trained checkpoints and download the
+pre-trained checkpoints from HuggingFace.
+
+ ```python
+ from huggingface_hub import login, snapshot_download
+ import os
+ # Get your Hugging Face token from https://huggingface.co/settings/tokens
+ login(token="<YOUR-HF-TOKEN>", add_to_git_credential=True)  # replace the placeholder with your token
+ # List the tokenizers you want to download.
+ model_names = [
+     "Cosmos-Tokenizer-CI8x8",
+     "Cosmos-Tokenizer-CI16x16",
+     "Cosmos-Tokenizer-CV4x8x8",
+     "Cosmos-Tokenizer-CV8x8x8",
+     "Cosmos-Tokenizer-CV8x16x16",
+     "Cosmos-Tokenizer-DI8x8",
+     "Cosmos-Tokenizer-DI16x16",
+     "Cosmos-Tokenizer-DV4x8x8",
+     "Cosmos-Tokenizer-DV8x8x8",
+     "Cosmos-Tokenizer-DV8x16x16",
+ ]
+ for model_name in model_names:
+     hf_repo = "nvidia/" + model_name
+     local_dir = "pretrained_ckpts/" + model_name
+     os.makedirs(local_dir, exist_ok=True)
+     print(f"downloading {model_name} to {local_dir}...")
+     snapshot_download(repo_id=hf_repo, local_dir=local_dir)
+ ```
+
+- Under each checkpoint directory in `pretrained_ckpts/`, we provide the encoder,
+decoder, and full autoencoder JIT models.
+
+ ```bash
+ ├── pretrained_ckpts/
+ │ ├── Cosmos-Tokenizer-DV8x8x8/
+ │ │ ├── encoder.jit
+ │ │ ├── decoder.jit
+ │ │ ├── autoencoder.jit
+ │ ...
+ ```
+
+### Step-3: Run Inference
+You can use the following example commands to encode and decode images or videos. For each, the same command works for both continuous and discrete tokenization. Simply provide the proper JIT-compiled ckpt to `checkpoint_enc`, `checkpoint_dec`, or the full autoencoder ckpt to `checkpoint`.
+
+```python
+import torch
+from cosmos_tokenizer.video_lib import CausalVideoTokenizer
+model_name = "Cosmos-Tokenizer-DV4x8x8"
+input_tensor = torch.randn(1, 3, 9, 512, 512).to('cuda').to(torch.bfloat16)
+encoder = CausalVideoTokenizer(checkpoint_enc=f'pretrained_ckpts/{model_name}/encoder.jit')
+(indices, codes) = encoder.encode(input_tensor)
+torch.testing.assert_close(indices.shape, (1, 3, 64, 64))
+torch.testing.assert_close(codes.shape, (1, 6, 3, 64, 64))
+
+# The input tensor can be reconstructed by the decoder as:
+decoder = CausalVideoTokenizer(checkpoint_dec=f'pretrained_ckpts/{model_name}/decoder.jit')
+reconstructed_tensor = decoder.decode(indices)
+torch.testing.assert_close(reconstructed_tensor.shape, input_tensor.shape)
+```
+
+The `indices` will have the shape `(1, 3, 64, 64)` and contain integer values in the range `[1..64K]`, where the first of the three index maps represents the first frame.
+The `codes` will contain the pre-quantization continuous latent with shape `(1, 6, 3, 64, 64)`, where C=6 is the number of FSQ channels, each with its own set of quantization levels.
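The shapes above can be reproduced with a small sketch. It assumes the causal layout suggested by the text, in which the first frame is encoded on its own and each following group of `temporal` frames adds one latent frame; the helper `latent_shape` is hypothetical and not part of the library.

```python
def latent_shape(input_shape, temporal=4, spatial=8):
    """Map a (B, C, T, H, W) input shape to the (B, T', H', W') index-grid shape.

    Assumes a causal layout: one latent frame for the first input frame,
    plus one for every further group of `temporal` frames.
    """
    batch, _channels, frames, height, width = input_shape
    latent_frames = 1 + (frames - 1) // temporal
    return (batch, latent_frames, height // spatial, width // spatial)


print(latent_shape((1, 3, 9, 512, 512)))  # (1, 3, 64, 64)
```

For the 9-frame, 512x512 example with the DV4x8x8 tokenizer this gives 1 + 8 // 4 = 3 latent frames and a 64x64 grid, matching the `indices` shape asserted in the example.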
+
+**Note**: More inference usage commands, including both TorchScript (JIT) and PyTorch Inference APIs on real images and videos, can be found on our GitHub repository [github.com/NVIDIA/Cosmos-Tokenizer](https://github.com/NVIDIA/Cosmos-Tokenizer).
+
+
+## Inference with NeMo
+
+### Step-1: Install NeMo
+Please install NeMo from the GitHub `main` branch following the instructions [here](https://github.com/NVIDIA/NeMo?tab=readme-ov-file#pip-from-a-source-branch).
+
+### Step-2: Run Inference
+Run the following code to tokenize the video:
+
+```python
+import torch
+from nemo.collections.common.video_tokenizers.cosmos_vision_tokenizer import CausalVideoTokenizer
+model_name = "Cosmos-Tokenizer-DV4x8x8"
+model = CausalVideoTokenizer.from_pretrained(model_name)
+input_tensor = torch.randn(1, 3, 9, 512, 512).to('cuda').to(torch.bfloat16)
+(indices, codes) = model.encode(input_tensor)
+```
+Please see the [Cosmos Tokenizer README within the NeMo repository](https://github.com/NVIDIA/NeMo/tree/main/nemo/collections/common/video_tokenizers) for additional examples to create training datasets with the Cosmos Tokenizer.
+
+
+# Evaluation
+
+## Tokenization Performance Comparison
+We have extensively evaluated the **Cosmos Tokenizer** suite on various image and video benchmark datasets. In addition to commonly used datasets such as [MS-COCO](https://cocodataset.org/#home) and [DAVIS](https://davischallenge.org/), in order to cover a wide variety of visual data and standardize the evaluation, we created a benchmark called [TokenBench](https://github.com/NVlabs/Token-Bench), which is a mixed sampling of video data from diverse domains.
+
+| Tokenizer | Compression Ratio | Quantization | PSNR (DAVIS) | SSIM (DAVIS) | rFVD (DAVIS) | PSNR (TokenBench) | SSIM (TokenBench) | rFVD (TokenBench) |
+|-----------|------------------|--------------|--------------|--------------|--------------|------------------|------------------|------------------|
+| VideoGPT | 4×4×4 | VQ | 32.23 | **0.850** | 72.33 | 35.11 | **0.914** | **13.85** |
+| OmniTokenizer | 4×8×8 | VQ | 28.44 | 0.712 | 188.60 | 30.15 | 0.827 | 53.55 |
+| Cosmos-Tokenizer-DV | 4×8×8 | FSQ | **32.98** | 0.818 | **37.36** | **35.13** | 0.887 | 19.67 |
+| Cosmos-Tokenizer-DV | 8×8×8 | FSQ | 32.11 | 0.775 | 100.15 | 34.74 | 0.872 | 43.86 |
+| Cosmos-Tokenizer-DV | 8×16×16 | FSQ | 31.42 | 0.716 | 241.52 | 33.71 | 0.828 | 113.48 |
+
+* We compare with the state-of-the-art discrete video tokenizer, [OmniTokenizer](https://github.com/FoundationVision/OmniTokenizer).
+* Evaluation metrics:
+ * Peak Signal-to-Noise Ratio (PSNR)
+ * Structural Similarity (SSIM)
+ * Reconstruction Fréchet Video Distance (rFVD)
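For readers unfamiliar with the first metric, PSNR can be sketched in a few lines of plain Python. This is only an illustration of the standard formula, 10 * log10(MAX^2 / MSE); the pixel range (0 to 255) and the flat-list input are assumptions, and this README does not specify how the reported numbers were averaged across frames or videos.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(reference) != len(reconstruction):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


original = [0, 64, 128, 255]
decoded = [1, 63, 129, 254]  # every pixel is off by one, so MSE = 1
print(round(psnr(original, decoded), 2))  # 48.13
```

Higher PSNR means a closer reconstruction. SSIM and rFVD need considerably more machinery and are not sketched here.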
+
+## Runtime Comparison
+
+The following table shows the number of parameters and the averaged encoding and decoding times per image or video frame, measured on a single A100 80GB GPU. For comparison, we also list the parameters and average speeds of prior state-of-the-art tokenizer(s) with the same compression ratio.
+
+| Tokenizer | Resolution | Compression Ratio | Parameters | Time (ms) |
+|----------------|------------|-------------------|------------|-----------|
+| OmniTokenizer | 720x1280 | 4×8×8 | 54M | 53.2 |
+| Cosmos-DV | 720x1280 | 4×8×8 | 105M | 51.5 |
+
+Note: We benchmarked the runtime for images under the 8x8 compression and videos under the 4×8×8 compression. Tokenizers with different compression ratios are not included in this comparison.
+
+## Ethical Considerations
+NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
+
+For more detailed information on ethical considerations for this model, please see the subcards of Explainability, Bias, Safety & Security, and Privacy below. Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
+
+### Bias
+
+Field | Response
+:---------------------------------------------------------------------------------------------------|:---------------
+Participation considerations from adversely impacted groups [protected classes](https://www.senate.ca.gov/content/protected-classes) in model design and testing: | None
+Measures taken to mitigate against unwanted bias: | None
+
+
+### Explainability
+
+Field | Response
+:------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------
+Intended Application & Domain: | Tokenization of images and videos
+Model Type: | Auto-Encoder
+Intended Users: | Generative AI developers for image and video generation models
+Output: | Images/Videos and Latent Tokens
+Describe how the model works: | Compresses and decompresses visual input (image/video).
+Technical Limitations: | Due to tokenizer compression limitations, some visual information (such as small text and other structured fine details) may not be reconstructed accurately.
+Verified to have met prescribed NVIDIA quality standards: | Yes
+Performance Metrics: | Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), Reconstruction Fréchet Video Distance (rFVD), Reconstruction Fréchet Inception Distance (rFID), Latency
+Potential Known Risks: | The tokenizer can process all forms of input, including content that may be considered toxic, offensive, or indecent.
+Licensing: | [NVIDIA Open Model License](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf)
+
+
+### Privacy
+Field | Response
+:----------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------
+Generatable or reverse engineerable personal information? | No
+Protected class data used to create this model? | None Known
+Was consent obtained for any personal data used? | None Known
+How often is dataset reviewed? | Before Release
+Is a mechanism in place to honor data subject right of access or deletion of personal data? | Not Applicable
+If personal data was collected for the development of the model, was it collected directly by NVIDIA? | Not Applicable
+If personal data was collected for the development of the model by NVIDIA, do you maintain or have access to disclosures made to data subjects? | Not Applicable
+If personal data was collected for the development of this AI model, was it minimized to only what was required? | Not Applicable
+Is there provenance for all datasets used in training? | Yes
+Does data labeling (annotation, metadata) comply with privacy laws? | Yes
+Is data compliant with data subject requests for data correction or removal, if such a request was made? | Not Applicable
+
+### Safety
+
+Field | Response
+:---------------------------------------------------|:----------------------------------
+Model Application(s): | Tokenization of images and videos
+Describe the life critical impact (if present). | None Known
+Use Case Restrictions: | See [NVIDIA Open Model License](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf)
+Model and dataset restrictions: | The principle of least privilege (PoLP) is applied, limiting access for dataset generation and model development. Restrictions enforce dataset access during training, and dataset license constraints are adhered to. Model checkpoints are made available on Hugging Face and may become available in cloud providers' model catalogs.
+
+
+### Plus Plus (++) Promise
+
+We value you, the datasets, the diversity they represent, and what we have been entrusted with. This model and its associated data have been:
+* Verified to comply with current applicable disclosure laws, regulations, and industry standards.
+* Verified to comply with applicable privacy labeling requirements.
+* Annotated to describe the collector/source (NVIDIA or a third-party).
+* Characterized for technical limitations.
+* Reviewed to ensure proper disclosure is accessible to, maintained for, and in compliance with NVIDIA data subjects and their requests.
+* Reviewed before release.
+* Tagged for known restrictions and potential safety implications.
+
+
+# Core Contributors
+Fitsum Reda, Jinwei Gu, Xian Liu, Songwei Ge, Ting-Chun Wang, Haoxiang Wang, Ming-Yu Liu
\ No newline at end of file
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/autoencoder.jit b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/autoencoder.jit
new file mode 100644
index 0000000000000000000000000000000000000000..79301739232c4699b33ff1489a7c8efd636b34f8
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/autoencoder.jit
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:eea104b84fee21d170fb20f99027c076ffd97e37b8d43a6a8f6135a2a61cfaf1
+size 211093069
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/config.json b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/config.json
new file mode 100644
index 0000000000000000000000000000000000000000..bcad561de5279b772db7dd4b76b11d07ddc7ced1
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/config.json
@@ -0,0 +1,6 @@
+{
+ "architectures": [
+ "CosmosTokenizer"
+ ],
+}
+
\ No newline at end of file
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/decoder.jit b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/decoder.jit
new file mode 100644
index 0000000000000000000000000000000000000000..1183024939ae16ac34b3fc29c0d3b61223aa7ced
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/decoder.jit
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a6b82dd6f4d489bbeb728e54c828d5a676f17e6eba9b9dfe2dc7839928bee73f
+size 125210440
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/encoder.jit b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/encoder.jit
new file mode 100644
index 0000000000000000000000000000000000000000..0992f10de6fafd4e4659bc8277cb958236045a0f
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/encoder.jit
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9a0e8459ab5e0ecfd0c00f215571de43e368f090c16adeb1a69fa835177bdea6
+size 86641076
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/model_config.yaml b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/model_config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..5be0900a5551d62cb295248f76186f2f665c51d0
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8/model_config.yaml
@@ -0,0 +1 @@
+nemo_version: https://github.com/NVIDIA/NeMo/commit/6a5d4b5d19e05262a4182a83613753d424153a8f
\ No newline at end of file
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/.gitattributes b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/.gitattributes
new file mode 100644
index 0000000000000000000000000000000000000000..14a69d3c1c50c140b42093ed67c383e51f98c237
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/.gitattributes
@@ -0,0 +1,38 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+autoencoder.jit filter=lfs diff=lfs merge=lfs -text
+decoder.jit filter=lfs diff=lfs merge=lfs -text
+encoder.jit filter=lfs diff=lfs merge=lfs -text
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/README.md b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..c12a96ecf56e164f03b8f41e6b1208849d393c3f
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/README.md
@@ -0,0 +1,325 @@
+---
+license: other
+license_name: nvidia-open-model-license
+license_link: >-
+ https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf
+library_name: nemo
+---
+# **Cosmos Tokenizer**: A suite of image and video tokenizers
+
+[**Website**](https://research.nvidia.com/labs/dir/cosmos-tokenizer) | [**Code**](https://github.com/NVIDIA/Cosmos-Tokenizer) | **Video**
+
+
+# Model Overview
+
+## Description:
+**Cosmos Tokenizer** is a suite of visual tokenizers for images and videos that delivers various compression rates while maintaining high reconstruction quality. Cosmos Tokenizer can serve as an effective and efficient building block in both diffusion-based and autoregressive models for image and video generation.
+
+
+Our tokenizers come in two types: **Continuous** (C) and **Discrete** (D), each with **Image** (I) and **Video** (V) variants:
+* Continuous tokenizers encode visual data into continuous latent embeddings, as shown in latent diffusion models like [Stable Diffusion](https://github.com/CompVis/stable-diffusion). These embeddings are suitable for models that generate data by sampling from continuous distributions.
+* Discrete tokenizers encode visual data into discrete latent codes, mapping them into quantized indices, as seen in autoregressive transformers such as [VideoPoet](https://sites.research.google/videopoet/). This discretization is required for models that generate data by optimizing the cross-entropy loss, such as the GPT models.
+
+
+| | Continuous ( C ) | Discrete ( D ) |
+| ------------------|---------------------|---------------------|
+| **Images ( I )** | Cosmos-Tokenizer-CI | Cosmos-Tokenizer-DI |
+| **Videos ( V )** | Cosmos-Tokenizer-CV | Cosmos-Tokenizer-DV |
+
+
+Given an image or a video, Cosmos Tokenizer outputs either continuous latents or discrete tokens. Cosmos Tokenizer achieves spatial compression rates of 8x8 or 16x16 and temporal compression factors of 4x or 8x, resulting in a total compression factor of up to 2048x (=8x16x16). Cosmos Tokenizer delivers 8x more total compression than state-of-the-art (SOTA) methods while simultaneously maintaining higher image quality and running up to 12x faster than the best available SOTA tokenizers.
+
+**Model Developer**: NVIDIA
+
+## Model Versions
+
+The initial release (v1.0) of Cosmos Tokenizer includes the following tokenizers:
+* **Continuous Tokenizers**
+ * Continuous Image (CI) Tokenizer
+ * [Cosmos-Tokenizer-CI8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-CI8x8) (8x8 spatial compression)
+ * [Cosmos-Tokenizer-CI16x16](https://huggingface.co/nvidia/Cosmos-Tokenizer-CI16x16) (16x16 spatial compression)
+ * Continuous Video (CV) Tokenizer
+ * [Cosmos-Tokenizer-CV4x8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-CV4x8x8) (4x temporal compression, 8x8 spatial compression)
+ * [Cosmos-Tokenizer-CV8x8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-CV8x8x8) (8x temporal compression, 8x8 spatial compression)
+ * [Cosmos-Tokenizer-CV8x16x16](https://huggingface.co/nvidia/Cosmos-Tokenizer-CV8x16x16) (8x temporal compression, 16x16 spatial compression)
+* **Discrete Tokenizers**
+ * Discrete Image (DI) Tokenizer
+ * [Cosmos-Tokenizer-DI8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-DI8x8) (8x8 spatial compression)
+ * [Cosmos-Tokenizer-DI16x16](https://huggingface.co/nvidia/Cosmos-Tokenizer-DI16x16) (16x16 spatial compression)
+ * Discrete Video (DV) Tokenizer
+ * [Cosmos-Tokenizer-DV4x8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-DV4x8x8) (4x temporal compression, 8x8 spatial compression)
+ * [Cosmos-Tokenizer-DV8x8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-DV8x8x8) (8x temporal compression, 8x8 spatial compression)
+ * [Cosmos-Tokenizer-DV8x16x16](https://huggingface.co/nvidia/Cosmos-Tokenizer-DV8x16x16) (8x temporal compression, 16x16 spatial compression)
+
+
+### License/Terms of Use:
+[NVIDIA Open Model License](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf)
+
+Under the NVIDIA Open Model License, NVIDIA confirms:
+
+* Models are commercially usable.
+* You are free to create and distribute Derivative Models.
+* NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models.
+
+## Model Architecture:
+
+We designed Cosmos Tokenizer with a lightweight, computationally efficient architecture featuring a temporally causal design. Specifically, we employ causal temporal convolution and causal temporal attention layers to preserve the natural temporal order of video frames, which allows a single unified network architecture to tokenize both images and videos. The encoder and decoder form a symmetric pair that mirror each other. The encoder starts with a 2-level [Haar wavelet](https://link.springer.com/book/10.1007/978-3-319-04295-4) transform layer, which down-samples inputs by a factor of 4 in both the spatial and temporal dimensions. Likewise, the decoder ends with an inverse wavelet transform. We employ the vanilla autoencoder (AE) formulation to model the latent space for continuous tokenizers. For discrete tokenizers, we adopt [Finite-Scalar-Quantization](https://openreview.net/forum?id=8ishA3LxN8) (FSQ) as the latent space quantizer.
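The FSQ idea mentioned above can be sketched as follows: each latent channel is bounded, rounded to a small number of integer bins, and the per-channel bins are combined into one token id. Everything specific here is an assumption for illustration: the level counts `(8, 8, 8, 5, 5, 5)` are chosen only because their product is 64,000, the tanh-then-round mapping is a simplified variant of the quantizer in the FSQ paper, and the ids are zero-based. This README does not state the configuration the released checkpoints use.

```python
import math

LEVELS = (8, 8, 8, 5, 5, 5)  # assumed per-channel level counts


def fsq_quantize(latent, levels=LEVELS):
    """Bound each channel with tanh, then round it to one of `levels[i]` integer bins."""
    return [round((math.tanh(z) + 1.0) / 2.0 * (n - 1)) for z, n in zip(latent, levels)]


def fsq_index(digits, levels=LEVELS):
    """Combine per-channel bins into a single token id (mixed-radix encoding)."""
    index, base = 0, 1
    for digit, n in zip(digits, levels):
        index += digit * base
        base *= n
    return index


codebook_size = math.prod(LEVELS)
digits = fsq_quantize([0.3, -1.2, 2.0, 0.0, -0.4, 0.9])
print(codebook_size, digits, fsq_index(digits))
```

Because the codebook is implicit in the rounding grid, no learned codebook lookup is needed, which is the main design difference from VQ-style tokenizers.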
+
+
+
+
+
+## Input/Output Specifications
+
+### Encoder
+* **Input**
+ * **Types:** Images or Videos
+ * **Format:** RGB (Red, Green, Blue)
+ * **Resolution:**
+ * Minimum: 256px (shorter side)
+ * Maximum: Up to 4K
+ * **Video Length:** Up to 8 seconds for 1080p videos (bounded by A100 80G GPU memory; higher resolutions will have shorter supported durations)
+
+* **Output**
+ * **Types:** Tokens
+ * Continuous Image/Video Tokenizers: Continuous value feature vectors
+ * Discrete Image/Video Tokenizers: Integer indices
+
+### Decoder
+* **Input**
+ * **Types:** Tokens from encoder
+
+* **Output**
+ * **Types:** Images or Videos (matching input type)
+ * **Format:** RGB (Red, Green, Blue)
+ * **Resolution:** Same as input resolution
+ * **Video Length:** Same as input video length
+
+## Software Integration (Required For NVIDIA Models Only):
+**Runtime Engine(s):**
+* [Cosmos-Tokenizer](https://github.com/NVIDIA/Cosmos-Tokenizer)
+* [NeMo](https://github.com/NVIDIA/NeMo) (please install the latest version from the GitHub main branch)
+
+**Supported Hardware Microarchitecture Compatibility:**
+* NVIDIA Ampere (e.g., A100)
+* NVIDIA Hopper (e.g., H100)
+
+Note: We have only tested Cosmos Tokenizer with BF16 precision on Ampere and Hopper GPUs. If you are using older versions of NVIDIA GPUs (e.g., NVIDIA Volta GPUs), you may need to switch to FP32 precision.
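The precision note above can be turned into a simple selection rule. The sketch below is an assumption, not part of the Cosmos-Tokenizer code: it treats a CUDA compute-capability major version of 8 or higher (Ampere and newer) as BF16-capable and falls back to FP32 otherwise. The function name is hypothetical.

```python
def pick_precision(compute_capability_major: int) -> str:
    """Return 'bfloat16' for Ampere-or-newer GPUs (major >= 8), else 'float32'."""
    return "bfloat16" if compute_capability_major >= 8 else "float32"


# Compute-capability majors: Volta = 7, Ampere = 8, Hopper = 9
for name, major in [("Volta", 7), ("Ampere", 8), ("Hopper", 9)]:
    print(name, pick_precision(major))
```

With PyTorch installed, the major version can be read from `torch.cuda.get_device_capability()`, which returns a `(major, minor)` tuple.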
+
+
+**Operating System(s):**
+* Linux (We have not tested on other operating systems.)
+
+# Usage
+Inference Engines:
+* [Cosmos-Tokenizer](https://github.com/NVIDIA/Cosmos-Tokenizer) (PyTorch)
+* [NeMo](https://github.com/NVIDIA/NeMo)
+
+## Inference with `Cosmos-Tokenizer` (PyTorch)
+### Step-1: Installation of `Cosmos-Tokenizer`
+Note: Currently, the `Cosmos-Tokenizer` code is only supported on Linux.
+
+- Please clone the `Cosmos-Tokenizer` from GitHub repo [github.com/NVIDIA/Cosmos-Tokenizer](https://github.com/NVIDIA/Cosmos-Tokenizer).
+
+ ```bash
+ git clone https://github.com/NVIDIA/Cosmos-Tokenizer.git
+ cd Cosmos-Tokenizer
+ ```
+- Install dependencies
+
+ ```bash
+ pip3 install -r requirements.txt
+ apt-get install -y ffmpeg
+ ```
+
+- Preferably, you could build a docker image using our provided Dockerfile.
+ ```bash
+ docker build -t cosmos-docker -f Dockerfile .
+ # You can run the container as:
+ docker run --gpus all -it --rm -v /home/${USER}:/home/${USER} \
+ --workdir ${PWD} cosmos-docker /bin/bash
+ ```
+
+### Step-2: Download Pre-trained Checkpoints
+- Create a local directory for the pre-trained checkpoints and download the
+pre-trained checkpoints from HuggingFace.
+
+ ```python
+ from huggingface_hub import login, snapshot_download
+ import os
+ # Get your Hugging Face token from https://huggingface.co/settings/tokens
+ login(token="<YOUR-HF-TOKEN>", add_to_git_credential=True)  # replace the placeholder with your token
+ # List the tokenizers you want to download.
+ model_names = [
+     "Cosmos-Tokenizer-CI8x8",
+     "Cosmos-Tokenizer-CI16x16",
+     "Cosmos-Tokenizer-CV4x8x8",
+     "Cosmos-Tokenizer-CV8x8x8",
+     "Cosmos-Tokenizer-CV8x16x16",
+     "Cosmos-Tokenizer-DI8x8",
+     "Cosmos-Tokenizer-DI16x16",
+     "Cosmos-Tokenizer-DV4x8x8",
+     "Cosmos-Tokenizer-DV8x8x8",
+     "Cosmos-Tokenizer-DV8x16x16",
+ ]
+ for model_name in model_names:
+     hf_repo = "nvidia/" + model_name
+     local_dir = "pretrained_ckpts/" + model_name
+     os.makedirs(local_dir, exist_ok=True)
+     print(f"downloading {model_name} to {local_dir}...")
+     snapshot_download(repo_id=hf_repo, local_dir=local_dir)
+ ```
+
+- Under each checkpoint directory in `pretrained_ckpts/`, we provide the encoder,
+decoder, and full autoencoder JIT models.
+
+ ```bash
+ ├── pretrained_ckpts/
+ │ ├── Cosmos-Tokenizer-DV8x8x8/
+ │ │ ├── encoder.jit
+ │ │ ├── decoder.jit
+ │ │ ├── autoencoder.jit
+ │ ...
+ ```
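Before running inference, it can help to confirm that a download produced the three files shown in the tree above. The following is a minimal sketch using only the standard library; the helper name and the choice of example directory are assumptions.

```python
from pathlib import Path

EXPECTED_FILES = ("encoder.jit", "decoder.jit", "autoencoder.jit")


def missing_checkpoint_files(checkpoint_dir):
    """Return the expected JIT files that are absent from `checkpoint_dir`."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


missing = missing_checkpoint_files("pretrained_ckpts/Cosmos-Tokenizer-DV8x8x8")
print("missing:", missing or "none")
```

An empty result means the directory has everything the inference examples load. A non-empty list usually points to an interrupted or partial download.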
+
+### Step-3: Run Inference
+You can use the following example commands to encode and decode images or videos. For each, the same command works for both continuous and discrete tokenization. Simply provide the proper JIT-compiled ckpt to `checkpoint_enc`, `checkpoint_dec`, or the full autoencoder ckpt to `checkpoint`.
+
+```python
+import torch
+from cosmos_tokenizer.video_lib import CausalVideoTokenizer
+model_name = "Cosmos-Tokenizer-DV4x8x8"
+input_tensor = torch.randn(1, 3, 9, 512, 512).to('cuda').to(torch.bfloat16)
+encoder = CausalVideoTokenizer(checkpoint_enc=f'pretrained_ckpts/{model_name}/encoder.jit')
+(indices, codes) = encoder.encode(input_tensor)
+torch.testing.assert_close(indices.shape, (1, 3, 64, 64))
+torch.testing.assert_close(codes.shape, (1, 6, 3, 64, 64))
+
+# The input tensor can be reconstructed by the decoder as:
+decoder = CausalVideoTokenizer(checkpoint_dec=f'pretrained_ckpts/{model_name}/decoder.jit')
+reconstructed_tensor = decoder.decode(indices)
+torch.testing.assert_close(reconstructed_tensor.shape, input_tensor.shape)
+```
+
+The `indices` will have the shape `(1, 3, 64, 64)` and contain integer values in the range `[1..64K]`, where the first of the three index maps represents the first frame.
+The `codes` will contain the pre-quantization continuous latent with shape `(1, 6, 3, 64, 64)`, where C=6 is the number of FSQ channels, each with its own set of quantization levels.
+
+**Note**: More inference usage commands, including both TorchScript (JIT) and PyTorch Inference APIs on real images and videos, can be found on our GitHub repository [github.com/NVIDIA/Cosmos-Tokenizer](https://github.com/NVIDIA/Cosmos-Tokenizer).
+
+
+## Inference with NeMo
+
+### Step-1: Install NeMo
+Please install NeMo from the GitHub `main` branch following the instructions [here](https://github.com/NVIDIA/NeMo?tab=readme-ov-file#pip-from-a-source-branch).
+
+### Step-2: Run Inference
+Run the following code to tokenize the video:
+
+```python
+import torch
+from nemo.collections.common.video_tokenizers.cosmos_vision_tokenizer import CausalVideoTokenizer
+model_name = "Cosmos-Tokenizer-DV4x8x8"
+model = CausalVideoTokenizer.from_pretrained(model_name)
+input_tensor = torch.randn(1, 3, 9, 512, 512).to('cuda').to(torch.bfloat16)
+(indices, codes) = model.encode(input_tensor)
+```
+Please see the [Cosmos Tokenizer README within the NeMo repository](https://github.com/NVIDIA/NeMo/tree/main/nemo/collections/common/video_tokenizers) for additional examples to create training datasets with the Cosmos Tokenizer.
+
+# Evaluation
+
+## Tokenization Performance Comparison
+We have extensively evaluated the **Cosmos Tokenizer** suite on various image and video benchmark datasets. In addition to commonly used datasets such as [MS-COCO](https://cocodataset.org/#home) and [DAVIS](https://davischallenge.org/), in order to cover a wide variety of visual data and standardize the evaluation, we created a benchmark called [TokenBench](https://github.com/NVlabs/Token-Bench), which is a mixed sampling of video data from diverse domains.
+
+| Tokenizer | Compression Ratio | Quantization | PSNR (DAVIS) | SSIM (DAVIS) | rFVD (DAVIS) | PSNR (TokenBench) | SSIM (TokenBench) | rFVD (TokenBench) |
+|-----------|------------------|--------------|--------------|--------------|--------------|------------------|------------------|------------------|
+| VideoGPT | 4×4×4 | VQ | 32.23 | **0.850** | 72.33 | 35.11 | **0.914** | **13.85** |
+| OmniTokenizer | 4×8×8 | VQ | 28.44 | 0.712 | 188.60 | 30.15 | 0.827 | 53.55 |
+| Cosmos-Tokenizer-DV | 4×8×8 | FSQ | **32.98** | 0.818 | **37.36** | **35.13** | 0.887 | 19.67 |
+| Cosmos-Tokenizer-DV | 8×8×8 | FSQ | 32.11 | 0.775 | 100.15 | 34.74 | 0.872 | 43.86 |
+| Cosmos-Tokenizer-DV | 8×16×16 | FSQ | 31.42 | 0.716 | 241.52 | 33.71 | 0.828 | 113.48 |
+
+* We compare with the state-of-the-art discrete video tokenizer, [OmniTokenizer](https://github.com/FoundationVision/OmniTokenizer).
+* Evaluation metrics:
+ * Peak Signal-to-Noise Ratio (PSNR)
+ * Structural Similarity (SSIM)
+ * Reconstruction Fréchet Video Distance (rFVD)
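
For readers unfamiliar with the first metric, PSNR can be computed directly from the mean squared error between a reference and its reconstruction. The following is a minimal pure-Python sketch over flat pixel sequences; the function name `psnr` and the 8-bit peak value of 255 are illustrative assumptions, not the evaluation code used for the table above.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB for two equal-length pixel sequences."""
    if len(reference) != len(reconstruction) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


original = [100, 120, 140, 160]
decoded = [110, 130, 150, 170]  # every pixel is off by 10
print(f"{psnr(original, decoded):.2f} dB")
```

Higher PSNR means the reconstruction is closer to the input; SSIM and rFVD need more machinery and are not sketched here.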
+
+## Runtime Comparison
+
+The following table shows the number of parameters and the averaged encoding and decoding times per image or video frame, measured on a single A100 80GB GPU. For comparison, we also list the parameters and average speeds of prior state-of-the-art tokenizer(s) with the same compression ratio.
+
+| Tokenizer | Resolution | Compression Ratio | Parameters | Time (ms) |
+|----------------|------------|-------------------|------------|-----------|
+| OmniTokenizer | 720x1280 | 4×8×8 | 54M | 53.2 |
+| Cosmos-DV | 720x1280 | 4×8×8 | 105M | 51.5 |
+
+Note: We benchmarked the runtime for images under the 8x8 compression and videos under the 4×8×8 compression. Tokenizers with different compression ratios are not included in this comparison.
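
A per-frame average like the one in the table can be approximated with a small timing harness. This is only a sketch under stated assumptions: `average_ms_per_item` is a hypothetical helper, and it is not the benchmarking script behind the reported numbers.

```python
import time


def average_ms_per_item(fn, items_per_call, warmup=2, repeats=10):
    """Average wall-clock milliseconds per item (image or frame) for fn()."""
    for _ in range(warmup):
        fn()  # warm-up runs are excluded from the measurement
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / (repeats * items_per_call)


# Stand-in workload; replace with a call that encodes and decodes a 9-frame clip.
ms = average_ms_per_item(lambda: sum(range(10_000)), items_per_call=9)
print(f"{ms:.4f} ms per frame")
```

When timing GPU code, the callable should end with `torch.cuda.synchronize()` so that asynchronously launched kernels are included in the measurement.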
+
+## Ethical Considerations
+NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
+
+For more detailed information on ethical considerations for this model, please see the subcards of Explainability, Bias, Safety & Security, and Privacy below. Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
+
+### Bias
+
+Field | Response
+:---------------------------------------------------------------------------------------------------|:---------------
+Participation considerations from adversely impacted groups [protected classes](https://www.senate.ca.gov/content/protected-classes) in model design and testing: | None
+Measures taken to mitigate against unwanted bias: | None
+
+
+### Explainability
+
+Field | Response
+:------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------
+Intended Application & Domain: | Tokenization of images and videos
+Model Type: | Auto-Encoder
+Intended Users: | Generative AI developers for image and video generation models
+Output: | Images/Videos and Latent Tokens
+Describe how the model works: | Compresses and decompresses visual input (image/video).
+Technical Limitations: | Due to tokenizer compression limitations, some visual information (such as small text and other structured fine details) may not be reconstructed accurately.
+Verified to have met prescribed NVIDIA quality standards: | Yes
+Performance Metrics: | Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), Reconstruction Fréchet Video Distance (rFVD), Reconstruction Fréchet Inception Distance (rFID), Latency
+Potential Known Risks: | The tokenizer can encode and reconstruct all forms of input, including content that may be considered toxic, offensive, or indecent.
+Licensing: | [NVIDIA Open Model License](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf)
+
+
+### Privacy
+Field | Response
+:----------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------
+Generatable or reverse engineerable personal information? | No
+Protected class data used to create this model? | None Known
+Was consent obtained for any personal data used? | None Known
+How often is dataset reviewed? | Before Release
+Is a mechanism in place to honor data subject right of access or deletion of personal data? | Not Applicable
+If personal data was collected for the development of the model, was it collected directly by NVIDIA? | Not Applicable
+If personal data was collected for the development of the model by NVIDIA, do you maintain or have access to disclosures made to data subjects? | Not Applicable
+If personal data was collected for the development of this AI model, was it minimized to only what was required? | Not Applicable
+Is there provenance for all datasets used in training? | Yes
+Does data labeling (annotation, metadata) comply with privacy laws? | Yes
+Is data compliant with data subject requests for data correction or removal, if such a request was made? | Not Applicable
+
+### Safety
+
+Field | Response
+:---------------------------------------------------|:----------------------------------
+Model Application(s): | Tokenization of images and videos
+Describe the life critical impact (if present). | None Known
+Use Case Restrictions: | See [NVIDIA Open Model License](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf)
+Model and dataset restrictions: | The principle of least privilege (PoLP) is applied, limiting access for dataset generation and model development. Restrictions enforce dataset access during training, and dataset license constraints are adhered to. Model checkpoints are made available on Hugging Face and may become available in cloud providers' model catalogs.
+
+
+### Plus Plus (++) Promise
+
+We value you, the datasets, the diversity they represent, and what we have been entrusted with. This model and its associated data have been:
+* Verified to comply with current applicable disclosure laws, regulations, and industry standards.
+* Verified to comply with applicable privacy labeling requirements.
+* Annotated to describe the collector/source (NVIDIA or a third-party).
+* Characterized for technical limitations.
+* Reviewed to ensure proper disclosure is accessible to, maintained for, and in compliance with NVIDIA data subjects and their requests.
+* Reviewed before release.
+* Tagged for known restrictions and potential safety implications.
+
+
+# Core Contributors
+Fitsum Reda, Jinwei Gu, Xian Liu, Songwei Ge, Ting-Chun Wang, Haoxiang Wang, Ming-Yu Liu
\ No newline at end of file
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/autoencoder.jit b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/autoencoder.jit
new file mode 100644
index 0000000000000000000000000000000000000000..6cd404c2c06d7396433fd28cd35d608ffd261cad
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/autoencoder.jit
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ccf00856bc49c9e1c5ca6b01f47ef65d35bcdb37d58724641a3ea45751714724
+size 213071541
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/config.json b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/config.json
new file mode 100644
index 0000000000000000000000000000000000000000..bcad561de5279b772db7dd4b76b11d07ddc7ced1
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/config.json
@@ -0,0 +1,6 @@
+{
+ "architectures": [
+ "CosmosTokenizer"
+ ]
+}
+
\ No newline at end of file
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/decoder.jit b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/decoder.jit
new file mode 100644
index 0000000000000000000000000000000000000000..6afa147171abff1b954f902dd760b01b1cbfee70
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/decoder.jit
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:881f1f6317872fad3eeeaa1e595061aa3ee12590d14ce435ac9e9e5c883e797b
+size 126792092
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/encoder.jit b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/encoder.jit
new file mode 100644
index 0000000000000000000000000000000000000000..3758c66d1c873532097db33174c5c6e40130864f
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/encoder.jit
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a014af9f0bfae97a29a5bf071ca58ac29be40d4ffae12ef08a11004b92f0fb8d
+size 87042184
diff --git a/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/model_config.yaml b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/model_config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..5be0900a5551d62cb295248f76186f2f665c51d0
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV8x8x8/model_config.yaml
@@ -0,0 +1 @@
+nemo_version: https://github.com/NVIDIA/NeMo/commit/6a5d4b5d19e05262a4182a83613753d424153a8f
\ No newline at end of file
diff --git a/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/.gitattributes b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/.gitattributes
new file mode 100644
index 0000000000000000000000000000000000000000..14a69d3c1c50c140b42093ed67c383e51f98c237
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/.gitattributes
@@ -0,0 +1,38 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+autoencoder.jit filter=lfs diff=lfs merge=lfs -text
+decoder.jit filter=lfs diff=lfs merge=lfs -text
+encoder.jit filter=lfs diff=lfs merge=lfs -text
diff --git a/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/README.md b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..dfebdabc2dfa12b157600490ef6e0264abcfd380
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/README.md
@@ -0,0 +1,396 @@
+---
+license: other
+license_name: nvidia-open-model-license
+license_link: >-
+ https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license
+library_name: cosmos
+tags:
+- nvidia
+- nemo
+- cosmos
+extra_gated_prompt: >-
+ # NVIDIA Open Model License Agreement
+
+ Version Release Date: January 6, 2025
+
+ This NVIDIA Open Model License Agreement (the "Agreement") is a legal agreement between the Legal Entity You represent, or if no entity is identified, You and NVIDIA Corporation and its Affiliates ("NVIDIA") and governs Your use of the Models that NVIDIA provides to You under this Agreement. NVIDIA and You are each a "party" and collectively the "parties."
+
+ NVIDIA models released under this Agreement are intended to be used permissively and enable the further development of AI technologies. Subject to the terms of this Agreement, NVIDIA confirms that:
+
+ * Models are commercially usable.
+
+ * You are free to create and distribute Derivative Models.
+
+ * NVIDIA does not claim ownership to any outputs generated using the Models or Model Derivatives.
+
+ By using, reproducing, modifying, distributing, performing or displaying any portion or element of the Model or Derivative Model, or otherwise accepting the terms of this Agreement, you agree to be bound by this Agreement.
+
+ ## 1. Definitions
+
+ The following definitions apply to this Agreement:
+
+ 1.1. "NVIDIA Cosmos Model" means a multimodal Model shared under this Agreement.
+
+ 1.2. "Derivative Model" means all (a) modifications to the Model, (b) works based on the Model, and (c) any other derivative works of the Model. An output is not a Derivative Model.
+
+ 1.3. "Legal Entity" means the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (a) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (b) ownership of fifty percent (50%) or more of the outstanding shares, or (c) beneficial ownership of such entity.
+
+ 1.4. "Model" means the machine learning model, software, checkpoints, learnt weights, algorithms, parameters, configuration files and documentation shared under this Agreement.
+
+ 1.5. "You" or "Your" means an individual or Legal Entity exercising permissions granted by this Agreement.
+
+ ## 2. Conditions for Use, License Grant, AI Ethics and IP Ownership
+
+ 2.1. Conditions for Use. The Model and any Derivative Model are subject to additional terms as described in Section 2 and Section 3 of this Agreement and govern Your use. If You institute copyright or patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Model or a Derivative Model constitutes direct or contributory copyright or patent infringement, then any licenses granted to You under this Agreement for that Model or Derivative Model will terminate as of the date such litigation is filed. If You bypass, disable, reduce the efficacy of, or circumvent any technical limitation, safety guardrail or associated safety guardrail hyperparameter, encryption, security, digital rights management, or authentication mechanism contained in the Model, your rights under this Agreement will automatically terminate. NVIDIA may update this Agreement to comply with legal and regulatory requirements at any time and You agree to either comply with any updated license or cease Your copying, use, and distribution of the Model and any Derivative Model.
+
+ 2.2. License Grant. The rights granted herein are explicitly conditioned on Your full compliance with the terms of this Agreement. Subject to the terms and conditions of this Agreement, NVIDIA hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, revocable (as stated in Section 2.1) license to publicly perform, publicly display, reproduce, use, create derivative works of, make, have made, sell, offer for sale, distribute (through multiple tiers of distribution) and import the Model.
+
+ 2.3. AI Ethics. Use of the Models under the Agreement must be consistent with NVIDIA's Trustworthy AI terms found at https://www.nvidia.com/en-us/agreements/trustworthy-ai/terms/.
+
+ 2.4. NVIDIA owns the Model and any Model Derivatives created by NVIDIA. Subject to NVIDIA's underlying ownership rights in the Model or its Model Derivatives, You are and will be the owner of Your Model Derivatives. NVIDIA claims no ownership rights in outputs. You are responsible for outputs and their subsequent uses. Except as expressly granted in this Agreement, (a) NVIDIA reserves all rights, interests and remedies in connection with the Model and (b) no other license or right is granted to you by implication, estoppel or otherwise.
+
+ ## 3. Redistribution
+
+ You may reproduce and distribute copies of the Model or Derivative Models thereof in any medium, with or without modifications, provided that You meet the following conditions:
+
+ 3.1. If you distribute the Model, You must give any other recipients of the Model a copy of this Agreement and include the following attribution notice within a "Notice" text file with such copies: "Licensed by NVIDIA Corporation under the NVIDIA Open Model License";
+
+ 3.2. If you distribute or make available a NVIDIA Cosmos Model, or a product or service (including an AI model) that contains or uses a NVIDIA Cosmos Model, use a NVIDIA Cosmos Model to create a Derivative Model, or use a NVIDIA Cosmos Model or its outputs to create, train, fine tune, or otherwise improve an AI model, you will include "Built on NVIDIA Cosmos" on a related website, user interface, blogpost, about page, or product documentation; and
+
+ 3.3. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Models as a whole, provided Your use, reproduction, and distribution of the Model otherwise complies with the conditions stated in this Agreement.
+
+ ## 4. Trademarks
+
+ This Agreement does not grant permission to use the trade names, trademarks, service marks, or product names of NVIDIA, except as required for reasonable and customary use in describing the origin of the Model and reproducing the content of the "Notice" text file.
+
+ ## **5. Disclaimer of Warranty**
+
+ **Unless required by applicable law or agreed to in writing, NVIDIA provides the Model on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Model, Derivative Models and outputs and assume any risks associated with Your exercise of permissions under this Agreement.**
+
+ ## **6. Limitation of Liability**
+
+ **In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, will NVIDIA be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this Agreement or out of the use or inability to use the Model, Derivative Models or outputs (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if NVIDIA has been advised of the possibility of such damages.**
+
+ ## 7. Indemnity
+
+ You will indemnify and hold harmless NVIDIA from and against any claim by any third party arising out of or related to your use or distribution of the Model, Model Derivatives or outputs.
+
+ ## 8. Feedback
+
+ NVIDIA appreciates your feedback, and You agree that NVIDIA may use it without restriction or compensation to You.
+
+ ## 9. Governing Law
+
+ This Agreement will be governed in all respects by the laws of the United States and the laws of the State of Delaware, without regard to conflict of laws principles or the United Nations Convention on Contracts for the International Sale of Goods. The state and federal courts residing in Santa Clara County, California will have exclusive jurisdiction over any dispute or claim arising out of or related to this Agreement, and the parties irrevocably consent to personal jurisdiction and venue in those courts; except that, either party may apply for injunctive remedies or an equivalent type of urgent legal relief in any jurisdiction.
+
+ ## 10. Trade and Compliance
+
+ You agree to comply with all applicable export, import, trade and economic sanctions laws and regulations, as amended, including without limitation U.S. Export Administration Regulations and Office of Foreign Assets Control regulations. These laws include restrictions on destinations, end-users and end-use.
+extra_gated_fields:
+ By clicking Submit below, I accept the terms of the NVIDIA Open Model License Agreement and acknowledge that I am an adult of legal age of majority in the country in which the Cosmos Models will be used and have authority to accept this Agreement: checkbox
+extra_gated_description: >-
+ The information you provide will be collected, stored, processed and shared in accordance with the [NVIDIA Privacy Policy](https://www.nvidia.com/en-us/about-nvidia/privacy-policy/).
+extra_gated_button_content: Submit
+---
+# **Cosmos Tokenizer**: A suite of image and video tokenizers
+
+[**Website**](https://research.nvidia.com/labs/dir/cosmos-tokenizer) | [**GitHub**](https://github.com/NVIDIA/Cosmos-Tokenizer) | [**NVIDIA News**](https://blogs.nvidia.com/blog/robot-learning-humanoid-development/) | [**NVIDIA Blog**](https://developer.nvidia.com/blog/state-of-the-art-multimodal-generative-ai-model-development-with-nvidia-nemo/) | [**Hugging Face**](https://huggingface.co/collections/nvidia/cosmos-tokenizer-672b93023add81b66a8ff8e6) | [**YouTube**](https://youtu.be/Soy_myOfWIU) | [**Paper**](https://arxiv.org/abs/2501.03575)
+
+# Model Overview
+
+## Description:
+**Cosmos Tokenizer** is a suite of visual tokenizers for images and videos that delivers various compression rates while maintaining high reconstruction quality. Cosmos Tokenizer can serve as an effective and efficient building block in both diffusion-based and autoregressive models for image and video generation. This model is ready for commercial use.
+
+
+Our tokenizers come in two types: **Continuous** (C) and **Discrete** (D), each with **Image** (I) and **Video** (V) variants:
+* Continuous tokenizers encode visual data into continuous latent embeddings, as shown in latent diffusion models like [Stable Diffusion](https://github.com/CompVis/stable-diffusion). These embeddings are suitable for models that generate data by sampling from continuous distributions.
+* Discrete tokenizers encode visual data into discrete latent codes, mapping them into quantized indices, as seen in autoregressive transformers such as [VideoPoet](https://sites.research.google/videopoet/). This discretization is required for models that generate data by optimizing the cross-entropy loss, such as the GPT models.
+
+
+| | Continuous ( C ) | Discrete ( D ) |
+| ------------------|---------------------|---------------------|
+| **Images ( I )** | Cosmos-Tokenizer-CI | Cosmos-Tokenizer-DI |
+| **Videos ( V )** | Cosmos-Tokenizer-CV | Cosmos-Tokenizer-DV |
+
+
+Given an image or a video, Cosmos Tokenizer outputs either continuous latents or discrete tokens. Cosmos Tokenizer achieves spatial compression rates of 8x8 or 16x16 and temporal compression factors of 4x or 8x, resulting in a total compression factor of up to 2048x (=8x16x16). Cosmos Tokenizer delivers 8x more total compression than state-of-the-art (SOTA) methods while simultaneously maintaining higher image quality and running up to 12x faster than the best available SOTA tokenizers.
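
The naming scheme above can be read mechanically: the letter pair gives the latent type and modality, and the trailing factors multiply to the total compression. A small illustration follows; the helper `parse_tokenizer_name` is hypothetical and not part of the Cosmos-Tokenizer package.

```python
import re


def parse_tokenizer_name(name):
    """Split a name such as 'Cosmos-1.0-Tokenizer-DV8x16x16' into its parts."""
    match = re.search(r"-([CD])([IV])(\d+(?:x\d+)+)$", name)
    if match is None:
        raise ValueError(f"unrecognized tokenizer name: {name}")
    factors = tuple(int(part) for part in match.group(3).split("x"))
    total = 1
    for factor in factors:
        total *= factor
    return {
        "latent": "continuous" if match.group(1) == "C" else "discrete",
        "modality": "image" if match.group(2) == "I" else "video",
        "factors": factors,  # (temporal, height, width) for video, (height, width) for image
        "total_compression": total,
    }


print(parse_tokenizer_name("Cosmos-1.0-Tokenizer-DV8x16x16"))
```

For the 8x16x16 video tokenizer the product is 2048, which is the maximum total compression factor quoted above.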
+
+**Model Developer**: NVIDIA
+
+## Model Versions
+
+This release (v1.0) of Cosmos Tokenizer includes the following tokenizers:
+* **Continuous Tokenizers**
+ * [Cosmos-1.0-Tokenizer-CV8x8x8](https://huggingface.co/nvidia/Cosmos-1.0-Tokenizer-CV8x8x8) (8x temporal compression, 8x8 spatial compression, 720 short spatial size, 121 frames context)
+* **Discrete Tokenizers**
+ * [Cosmos-1.0-Tokenizer-DV8x16x16](https://huggingface.co/nvidia/Cosmos-1.0-Tokenizer-DV8x16x16) (8x temporal compression, 16x16 spatial compression, 720 short spatial size, 49 frames context)
+
+The previous release (v0.1) of Cosmos Tokenizer included the following tokenizers:
+* **Continuous Tokenizers**
+ * Continuous Image (CI) Tokenizer
+ * [Cosmos-0.1-Tokenizer-CI8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-CI8x8) (8x8 spatial compression)
+ * [Cosmos-0.1-Tokenizer-CI16x16](https://huggingface.co/nvidia/Cosmos-Tokenizer-CI16x16) (16x16 spatial compression)
+ * Continuous Video (CV) Tokenizer
+ * [Cosmos-0.1-Tokenizer-CV4x8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-CV4x8x8) (4x temporal compression, 8x8 spatial compression)
+ * [Cosmos-0.1-Tokenizer-CV8x8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-CV8x8x8) (8x temporal compression, 8x8 spatial compression)
+ * [Cosmos-0.1-Tokenizer-CV8x16x16](https://huggingface.co/nvidia/Cosmos-Tokenizer-CV8x16x16) (8x temporal compression, 16x16 spatial compression)
+* **Discrete Tokenizers**
+ * Discrete Image (DI) Tokenizer
+ * [Cosmos-0.1-Tokenizer-DI8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-DI8x8) (8x8 spatial compression)
+ * [Cosmos-0.1-Tokenizer-DI16x16](https://huggingface.co/nvidia/Cosmos-Tokenizer-DI16x16) (16x16 spatial compression)
+ * Discrete Video (DV) Tokenizer
+ * [Cosmos-0.1-Tokenizer-DV4x8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-DV4x8x8) (4x temporal compression, 8x8 spatial compression)
+ * [Cosmos-0.1-Tokenizer-DV8x8x8](https://huggingface.co/nvidia/Cosmos-Tokenizer-DV8x8x8) (8x temporal compression, 8x8 spatial compression)
+ * [Cosmos-0.1-Tokenizer-DV8x16x16](https://huggingface.co/nvidia/Cosmos-Tokenizer-DV8x16x16) (8x temporal compression, 16x16 spatial compression)
+
+
+
+### License:
+This model is released under the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license). For a custom license, please contact [cosmos-license@nvidia.com](mailto:cosmos-license@nvidia.com).
+
+Under the NVIDIA Open Model License, NVIDIA confirms:
+
+* Models are commercially usable.
+* You are free to create and distribute Derivative Models.
+* NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models.
+
+**Important Note**: If you bypass, disable, reduce the efficacy of, or circumvent any technical limitation, safety guardrail or
+associated safety guardrail hyperparameter, encryption, security, digital rights management, or authentication mechanism contained
+in the Model, your rights under [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license) will automatically terminate.
+
+## Model Architecture:
+
+We designed Cosmos Tokenizer using a lightweight and computationally efficient architecture, featuring a temporally causal design. Specifically, we employ causal temporal convolution and causal temporal attention layers to preserve the natural temporal order of video frames, ensuring seamless tokenization of images and videos using a single unified network architecture. The encoder and decoder form a symmetrical pair, which are mirrors of each other. The encoder starts with a 2-level [Haar wavelet](https://link.springer.com/book/10.1007/978-3-319-04295-4) transform layer, which down-samples inputs by a factor of 4 in both spatial and temporal dimensions. Likewise, the decoder ends with an inverse wavelet transform. We employ the vanilla autoencoder (AE) formulation to model the latent space for continuous tokenizers. For discrete tokenizers, we adopt the [Finite-Scalar-Quantization](https://openreview.net/forum?id=8ishA3LxN8) (FSQ) as the latent space quantizer.
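
To make the FSQ step more concrete, the sketch below shows how a six-channel latent could be snapped to a fixed grid and packed into a single token index. It is a simplified illustration, not the tokenizer's implementation: the per-channel level counts `(8, 8, 8, 5, 5, 5)` are an assumption chosen so that their product is 64,000 (the index range listed in this card's input/output specifications), and the uniform rounding stands in for the bounded rounding used in the FSQ paper.

```python
LEVELS = (8, 8, 8, 5, 5, 5)  # assumed level counts; 8*8*8*5*5*5 = 64,000


def quantize(values, levels=LEVELS):
    """Snap each value in [-1, 1] to one of levels[i] evenly spaced grid points."""
    digits = []
    for value, n in zip(values, levels):
        value = max(-1.0, min(1.0, value))  # clamp to the supported range
        digits.append(int(round((value + 1.0) / 2.0 * (n - 1))))
    return digits


def digits_to_index(digits, levels=LEVELS):
    """Pack per-channel digits into one integer using mixed-radix encoding."""
    index, base = 0, 1
    for digit, n in zip(digits, levels):
        index += digit * base
        base *= n
    return index


def index_to_digits(index, levels=LEVELS):
    """Inverse of digits_to_index."""
    digits = []
    for n in levels:
        digits.append(index % n)
        index //= n
    return digits


digits = quantize([0.9, -0.2, 0.0, 1.0, -1.0, 0.3])
token = digits_to_index(digits)
print(digits, token, index_to_digits(token) == digits)
```

Because every channel has only a few allowed values, no learned codebook is needed: the token index is just the position of the rounded latent in the implicit grid.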
+
+
+
+
+
+## Input/Output Specifications
+
+### Encoder
+* **Input**
+ * **Type:** Images or Videos
+ * **Format:** RGB (Red, Green, Blue)
+ * **Properties:**
+ * **Resolution:** Minimum: 256px (shorter side). Maximum: Up to 4K
+ * **Video Length:** Up to 8 seconds for 1080p videos (bounded by A100 80G GPU memory; higher resolutions will have shorter supported durations)
+
+* **Output**
+ * **Type:** Tokens
+ * **Properties:**
+ * Integer indices ranging from 0 to 63,999
+
+### Decoder
+* **Input**
+ * **Type:** Tokens
+ * **Properties:**
+ * Integer indices ranging from 0 to 63,999
+
+* **Output**
+ * **Type:** Images or Videos (matching input type)
+ * **Format:** RGB (Red, Green, Blue)
+ * **Properties:**
+ * **Resolution:** Same as input resolution
+ * **Video Length:** Same as input video length
+
+## Software Integration (Required For NVIDIA Models Only):
+**Runtime Engine(s):**
+* [Cosmos-Tokenizer](https://github.com/NVIDIA/Cosmos-Tokenizer)
+
+**Supported Hardware Microarchitecture Compatibility:**
+* NVIDIA Ampere (e.g., A100)
+* NVIDIA Hopper (e.g., H100)
+
+Note: We have only tested Cosmos Tokenizer with BF16 precision on Ampere and Hopper GPUs. If you are using older versions of NVIDIA GPUs (e.g., NVIDIA Volta GPUs), you may need to switch to FP32 precision.
+
+
+**Operating System(s):**
+* Linux (We have not tested on other operating systems.)
+
+# Usage
+Inference Engines:
+* [Cosmos-Tokenizer](https://github.com/NVIDIA/Cosmos-Tokenizer) (PyTorch)
+
+## Inference with `Cosmos-Tokenizer` (PyTorch)
+### Step-1: Installation of `Cosmos-Tokenizer`
+Note: Currently, the `Cosmos-Tokenizer` code is only supported on Linux.
+
+- Please clone the `Cosmos-Tokenizer` from GitHub repo [github.com/NVIDIA/Cosmos-Tokenizer](https://github.com/NVIDIA/Cosmos-Tokenizer).
+
+ ```bash
+ git clone https://github.com/NVIDIA/Cosmos-Tokenizer.git
+ cd Cosmos-Tokenizer
+ ```
+- Install dependencies
+
+ ```bash
+ pip3 install -r requirements.txt
+ apt-get install -y ffmpeg
+ ```
+
+- Preferably, you could build a docker image using our provided Dockerfile.
+ ```bash
+ docker build -t cosmos-docker -f Dockerfile .
+ # You can run the container as:
+ docker run --gpus all -it --rm -v /home/${USER}:/home/${USER} \
+ --workdir ${PWD} cosmos-docker /bin/bash
+ ```
+
+### Step-2: Download Pre-trained Checkpoints
+- Create a local directory for the pre-trained checkpoints and download the
+pre-trained checkpoints from HuggingFace.
+
+ ```python
+ from huggingface_hub import login, snapshot_download
+ import os
+ # You could get your Hugging Face token from https://huggingface.co/settings/tokens
+ login(token="<YOUR-HF-TOKEN>", add_to_git_credential=True)  # replace the placeholder with your own token
+ # You could specify the tokenizers you want to download.
+ model_names = [
+ "Cosmos-1.0-Tokenizer-DV8x16x16",
+ ]
+ for model_name in model_names:
+ hf_repo = "nvidia/" + model_name
+ local_dir = "pretrained_ckpts/" + model_name
+ os.makedirs(local_dir, exist_ok=True)
+ print(f"downloading {model_name} to {local_dir}...")
+ snapshot_download(repo_id=hf_repo, local_dir=local_dir)
+ ```
+
+- Under each checkpoint directory in `pretrained_ckpts/`, we provide the encoder,
+decoder, and full autoencoder JIT models.
+
+ ```bash
+ ├── pretrained_ckpts/
+ │ ├── Cosmos-1.0-Tokenizer-DV8x16x16/
+ │ │ ├── encoder.jit
+ │ │ ├── decoder.jit
+ │ │ ├── autoencoder.jit
+ │ ...
+ ```
+
+### Step-3: Run Inference
+You can use the following example commands to encode and decode images or videos. For each, the same command works for both continuous and discrete tokenization. Simply provide the proper JIT-compiled ckpt to `checkpoint_enc`, `checkpoint_dec`, or the full autoencoder ckpt to `checkpoint`.
+
+```python
+import torch
+from cosmos_tokenizer.video_lib import CausalVideoTokenizer
+model_name = "Cosmos-1.0-Tokenizer-DV8x16x16"
+input_tensor = torch.randn(1, 3, 9, 512, 512).to('cuda').to(torch.bfloat16)
+encoder = CausalVideoTokenizer(checkpoint_enc=f'pretrained_ckpts/{model_name}/encoder.jit')
+(indices, codes) = encoder.encode(input_tensor)
+torch.testing.assert_close(indices.shape, (1, 2, 32, 32))
+torch.testing.assert_close(codes.shape, (1, 6, 2, 32, 32))
+
+# The input tensor can be reconstructed by the decoder as:
+decoder = CausalVideoTokenizer(checkpoint_dec=f'pretrained_ckpts/{model_name}/decoder.jit')
+reconstructed_tensor = decoder.decode(indices)
+torch.testing.assert_close(reconstructed_tensor.shape, input_tensor.shape)
+```
+
+The `indices` will have the shape `(1, 2, 32, 32)` and contain integer values in the range `[1..64K]`, where the first of the two integer maps represents the first frame.
+The `codes` will contain the pre-quantization continuous latent with shape `(1, 6, 2, 32, 32)`, where C=6 represents the number of FSQ levels.
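
The shapes above follow from the `8x16x16` compression ratio. The sketch below reproduces the arithmetic under two assumptions that this card does not state explicitly: the causal tokenizer encodes the first frame on its own and then one latent frame per 8 input frames, and the FSQ codebook uses the level layout `(8, 8, 8, 5, 5, 5)`. `latent_shape` and `ASSUMED_FSQ_LEVELS` are hypothetical names.

```python
import math


def latent_shape(num_frames, height, width, temporal=8, spatial=16):
    """Latent (T, H, W) for a causal tokenizer with the given compression.

    Assumes the first frame maps to its own latent frame, so valid inputs
    have num_frames = 1 + k * temporal.
    """
    if num_frames < 1 or (num_frames - 1) % temporal != 0:
        raise ValueError("num_frames must be 1 + k * temporal")
    if height % spatial or width % spatial:
        raise ValueError("height and width must be divisible by spatial")
    return (1 + (num_frames - 1) // temporal, height // spatial, width // spatial)


# Assumed FSQ layout: six channels whose level counts multiply to the
# vocabulary size quoted above (64K).
ASSUMED_FSQ_LEVELS = (8, 8, 8, 5, 5, 5)

print(latent_shape(9, 512, 512))      # (2, 32, 32)
print(math.prod(ASSUMED_FSQ_LEVELS))  # 64000
```

The same arithmetic gives 3 latent frames for a 17-frame clip and 7 for a 49-frame clip, the two clip lengths used in the evaluation table.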
+
+**Note**: More inference usage commands, including both TorchScript (JIT) and PyTorch Inference APIs on real images and videos, can be found on our GitHub repository [github.com/NVIDIA/Cosmos-Tokenizer](https://github.com/NVIDIA/Cosmos-Tokenizer).
+
+
+# Evaluation
+
+## Tokenization Performance Comparison
+We have extensively evaluated the additional **Cosmos Tokenizer** models on the [DAVIS](https://davischallenge.org/) video benchmark dataset.
+
+| Tokenizer | Compression Ratio | Height | Num. of Frames | Quantization | PSNR (DAVIS) | SSIM (DAVIS) | rFVD (DAVIS) |
+|-----------|------------------|--------------|--------------|--------------|--------------|--------------|--------------|
+| VideoGPT | 4×4×4 | - | - | VQ | 32.23 | **0.850** | 72.33 |
+| OmniTokenizer | 4×8×8 | - | - | VQ | 28.44 | 0.712 | 188.60 |
+| Cosmos-Tokenizer-DV | 4×8×8 | 720 | 17 | FSQ | **32.98** | 0.818 | **37.36** |
+| Cosmos-Tokenizer-DV | 8×8×8 | 720 | 17 | FSQ | 32.11 | 0.775 | 100.15 |
+| Cosmos-Tokenizer-DV | 8×16×16 | 720 | 17 | FSQ | 31.42 | 0.716 | 241.52 |
+| Cosmos-Tokenizer-DV | 8×16×16 | 720 | 49 | FSQ | 31.59 | 0.719 | 259.33 |
+
+
+* We compare with the state-of-the-art discrete video tokenizer, [OmniTokenizer](https://github.com/FoundationVision/OmniTokenizer).
+* Evaluation metrics:
+ * Peak Signal-to-Noise Ratio (PSNR)
+ * Structural Similarity (SSIM)
+ * Reconstruction Fréchet Video Distance (rFVD)
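
Of the three metrics, PSNR is the simplest to reproduce. The sketch below uses the standard definition, 10 * log10(MAX^2 / MSE), over flat sequences of pixel values; it illustrates the metric only and is not the evaluation code behind the table above. The `psnr` helper and the `[0, 1]` value range are assumptions.

```python
import math


def psnr(reference, reconstruction, max_value=1.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length sequences."""
    if len(reference) != len(reconstruction) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] signal gives roughly 20 dB.
print(round(psnr([0.0, 0.0], [0.1, 0.1]), 2))
```

Higher is better for PSNR and SSIM, while lower is better for rFVD.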
+
+## Runtime Comparison
+
+The following table shows the number of parameters and the averaged encoding and decoding times per image or video frame, measured on a single A100 80GB GPU. For comparison, we also list the parameters and average speeds of prior state-of-the-art tokenizer(s) with the same compression ratio.
+
+| Tokenizer | Resolution | Compression Ratio | Parameters | Time (ms) |
+|----------------|------------|-------------------|------------|-----------|
+| OmniTokenizer | 720x1280 | 4×8×8 | 54M | 53.2 |
+| Cosmos-DV | 720x1280 | 4×8×8 | 105M | 51.5 |
+
+Note: We benchmarked the runtime for images under the 8x8 compression and videos under the 4×8×8 compression. Tokenizers with different compression ratios are not included in this comparison.
+
+## Ethical Considerations
+NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
+
+### Plus Plus (++) Promise
+
+We value you, the datasets, the diversity they represent, and what we have been entrusted with. This model and its associated data have been:
+* Verified to comply with current applicable disclosure laws, regulations, and industry standards.
+* Verified to comply with applicable privacy labeling requirements.
+* Annotated to describe the collector/source (NVIDIA or a third-party).
+* Characterized for technical limitations.
+* Reviewed to ensure proper disclosure is accessible to, maintained for, and in compliance with NVIDIA data subjects and their requests.
+* Reviewed before release.
+* Tagged for known restrictions and potential safety implications.
+
+For more detailed information on ethical considerations for this model, please see the subcards of Explainability, Bias, Safety & Security, and Privacy below. Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
+
+### Bias
+
+Field | Response
+:---------------------------------------------------------------------------------------------------|:---------------
+Participation considerations from adversely impacted groups [protected classes](https://www.senate.ca.gov/content/protected-classes) in model design and testing: | None
+Measures taken to mitigate against unwanted bias: | None
+
+
+### Explainability
+
+Field | Response
+:------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------
+Intended Application & Domain: | Tokenization of images and videos
+Model Type: | Auto-Encoder
+Intended Users: | Generative AI developers for image and video generation models
+Output: | Images/Videos and Latent Tokens
+Describe how the model works: | Compresses and decompresses visual input (image/video).
+Technical Limitations:                                                                                 | Due to tokenizer compression limitations, some visual information (such as small text and other structured fine details) may not be reconstructed accurately. Reconstruction quality may also be lower for low-resolution videos, e.g. below 320p.
+Verified to have met prescribed NVIDIA quality standards: | Yes
+Performance Metrics: | Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), Reconstruction Fréchet Video Distance (rFVD), Reconstruction Fréchet Inception Distance (rFID), Latency
+Potential Known Risks: | Tokenizer's output can parse all forms of input, including what may be considered toxic, offensive, or indecent.
+Licensing: | [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license)
+
+
+### Privacy
+Field | Response
+:----------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------
+Generatable or reverse engineerable personal information? | No
+Protected class data used to create this model? | None Known
+Was consent obtained for any personal data used? | None Known
+How often is dataset reviewed? | Before Release
+Is a mechanism in place to honor data subject right of access or deletion of personal data? | Not Applicable
+If personal data was collected for the development of the model, was it collected directly by NVIDIA? | Not Applicable
+If personal data was collected for the development of the model by NVIDIA, do you maintain or have access to disclosures made to data subjects? | Not Applicable
+If personal data was collected for the development of this AI model, was it minimized to only what was required? | Not Applicable
+Is there provenance for all datasets used in training? | Yes
+Does data labeling (annotation, metadata) comply with privacy laws? | Yes
+Is data compliant with data subject requests for data correction or removal, if such a request was made? | Not Applicable
+
+### Safety
+
+Field | Response
+:---------------------------------------------------|:----------------------------------
+Model Application(s): | Tokenization of images and videos
+Describe the life critical impact (if present). | None Known
+Use Case Restrictions: | [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license)
+Model and dataset restrictions:                     | The principle of least privilege (PoLP) is applied, limiting access for dataset generation and model development. Restrictions enforce dataset access during training, and dataset license constraints are adhered to. Model checkpoints are made available on Hugging Face and may become available in cloud providers' model catalogs.
+
+
+# Core Contributors
+Fitsum Reda, Jinwei Gu, Xian Liu, Songwei Ge, Ting-Chun Wang, Haoxiang Wang, Ming-Yu Liu
\ No newline at end of file
diff --git a/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/autoencoder.jit b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/autoencoder.jit
new file mode 100644
index 0000000000000000000000000000000000000000..7ca041131d5553bded0ca8122cb0821ea70ae0b0
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/autoencoder.jit
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d6871316ea4e6bd14a5f82c87c48ba0bcd853496830034a38701bc0ccd501c89
+size 223576773
diff --git a/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/config.json b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/config.json
new file mode 100644
index 0000000000000000000000000000000000000000..bcad561de5279b772db7dd4b76b11d07ddc7ced1
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/config.json
@@ -0,0 +1,6 @@
+{
+ "architectures": [
+ "CosmosTokenizer"
+  ]
+}
+
\ No newline at end of file
diff --git a/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/decoder.jit b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/decoder.jit
new file mode 100644
index 0000000000000000000000000000000000000000..59eed113cb2aeeebba6f9c6603bddb812aa4cf4c
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/decoder.jit
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:40add3d4d7c00e0d4ee30fcc69c2171fa30cb045b70b7e5f979ad66419e8dcd9
+size 132042180
diff --git a/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/encoder.jit b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/encoder.jit
new file mode 100644
index 0000000000000000000000000000000000000000..2cf3df91b8de7a5d1eeab660bdbe68cea1f7f1f6
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/encoder.jit
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:51c760a14262cbc5db93c09bcb4710fa178e0e27928e4a75170f07db7225d4a8
+size 92292848
diff --git a/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/model_config.yaml b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/model_config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..5be0900a5551d62cb295248f76186f2f665c51d0
--- /dev/null
+++ b/Meissonic/pretrained_ckpts/Cosmos-1.0-Tokenizer-DV8x16x16/model_config.yaml
@@ -0,0 +1 @@
+nemo_version: https://github.com/NVIDIA/NeMo/commit/6a5d4b5d19e05262a4182a83613753d424153a8f
\ No newline at end of file
diff --git a/Meissonic/requirements.txt b/Meissonic/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..30e0da1834eb9c1d95d70c9f269f3d5225443582
--- /dev/null
+++ b/Meissonic/requirements.txt
@@ -0,0 +1,21 @@
+--extra-index-url https://download.pytorch.org/whl/cu124
+accelerate
+pytorch-lightning
+torch
+torchvision
+tqdm
+transformers
+numpy
+gradio
+diffusers
+bitsandbytes
+open_clip_torch
+datasets
+peft
+pillow
+wandb
+dask
+pyarrow
+huggingface_hub
+peft
+sentencepiece
\ No newline at end of file
diff --git a/Meissonic/src/attention.py b/Meissonic/src/attention.py
new file mode 100644
index 0000000000000000000000000000000000000000..4dbbe03fc79e1eb1509dfd98720b60196144878d
--- /dev/null
+++ b/Meissonic/src/attention.py
@@ -0,0 +1,179 @@
+# Copyright 2024-2025 The Alibaba Wan Team Authors. All rights reserved.
+import torch
+
+try:
+ import flash_attn_interface
+ FLASH_ATTN_3_AVAILABLE = True
+except ModuleNotFoundError:
+ FLASH_ATTN_3_AVAILABLE = False
+
+try:
+ import flash_attn
+ FLASH_ATTN_2_AVAILABLE = True
+except ModuleNotFoundError:
+ FLASH_ATTN_2_AVAILABLE = False
+
+import warnings
+
+__all__ = [
+ 'flash_attention',
+ 'attention',
+]
+
+
+def flash_attention(
+ q,
+ k,
+ v,
+ q_lens=None,
+ k_lens=None,
+ dropout_p=0.,
+ softmax_scale=None,
+ q_scale=None,
+ causal=False,
+ window_size=(-1, -1),
+ deterministic=False,
+ dtype=torch.bfloat16,
+ version=None,
+):
+ """
+ q: [B, Lq, Nq, C1].
+ k: [B, Lk, Nk, C1].
+ v: [B, Lk, Nk, C2]. Nq must be divisible by Nk.
+ q_lens: [B].
+ k_lens: [B].
+ dropout_p: float. Dropout probability.
+ softmax_scale: float. The scaling of QK^T before applying softmax.
+ causal: bool. Whether to apply causal attention mask.
+    window_size: (left, right). If not (-1, -1), apply sliding window local attention.
+ deterministic: bool. If True, slightly slower and uses more memory.
+ dtype: torch.dtype. Apply when dtype of q/k/v is not float16/bfloat16.
+ """
+ half_dtypes = (torch.float16, torch.bfloat16)
+ assert dtype in half_dtypes
+ assert q.device.type == 'cuda' and q.size(-1) <= 256
+
+ # params
+ b, lq, lk, out_dtype = q.size(0), q.size(1), k.size(1), q.dtype
+
+ def half(x):
+ return x if x.dtype in half_dtypes else x.to(dtype)
+
+ # preprocess query
+ if q_lens is None:
+ q = half(q.flatten(0, 1))
+ q_lens = torch.tensor(
+ [lq] * b, dtype=torch.int32).to(
+ device=q.device, non_blocking=True)
+ else:
+ q = half(torch.cat([u[:v] for u, v in zip(q, q_lens)]))
+
+ # preprocess key, value
+ if k_lens is None:
+ k = half(k.flatten(0, 1))
+ v = half(v.flatten(0, 1))
+ k_lens = torch.tensor(
+ [lk] * b, dtype=torch.int32).to(
+ device=k.device, non_blocking=True)
+ else:
+ k = half(torch.cat([u[:v] for u, v in zip(k, k_lens)]))
+ v = half(torch.cat([u[:v] for u, v in zip(v, k_lens)]))
+
+ q = q.to(v.dtype)
+ k = k.to(v.dtype)
+
+ if q_scale is not None:
+ q = q * q_scale
+
+ if version is not None and version == 3 and not FLASH_ATTN_3_AVAILABLE:
+ warnings.warn(
+ 'Flash attention 3 is not available, use flash attention 2 instead.'
+ )
+
+ # apply attention
+ if (version is None or version == 3) and FLASH_ATTN_3_AVAILABLE:
+ # Note: dropout_p, window_size are not supported in FA3 now.
+ x = flash_attn_interface.flash_attn_varlen_func(
+ q=q,
+ k=k,
+ v=v,
+ cu_seqlens_q=torch.cat([q_lens.new_zeros([1]), q_lens]).cumsum(
+ 0, dtype=torch.int32).to(q.device, non_blocking=True),
+ cu_seqlens_k=torch.cat([k_lens.new_zeros([1]), k_lens]).cumsum(
+ 0, dtype=torch.int32).to(q.device, non_blocking=True),
+ seqused_q=None,
+ seqused_k=None,
+ max_seqlen_q=lq,
+ max_seqlen_k=lk,
+ softmax_scale=softmax_scale,
+ causal=causal,
+ deterministic=deterministic)[0].unflatten(0, (b, lq))
+ else:
+ assert FLASH_ATTN_2_AVAILABLE
+ x = flash_attn.flash_attn_varlen_func(
+ q=q,
+ k=k,
+ v=v,
+ cu_seqlens_q=torch.cat([q_lens.new_zeros([1]), q_lens]).cumsum(
+ 0, dtype=torch.int32).to(q.device, non_blocking=True),
+ cu_seqlens_k=torch.cat([k_lens.new_zeros([1]), k_lens]).cumsum(
+ 0, dtype=torch.int32).to(q.device, non_blocking=True),
+ max_seqlen_q=lq,
+ max_seqlen_k=lk,
+ dropout_p=dropout_p,
+ softmax_scale=softmax_scale,
+ causal=causal,
+ window_size=window_size,
+ deterministic=deterministic).unflatten(0, (b, lq))
+
+ # output
+ return x.type(out_dtype)
+
+
+def attention(
+ q,
+ k,
+ v,
+ q_lens=None,
+ k_lens=None,
+ dropout_p=0.,
+ softmax_scale=None,
+ q_scale=None,
+ causal=False,
+ window_size=(-1, -1),
+ deterministic=False,
+ dtype=torch.bfloat16,
+ fa_version=None,
+):
+ if FLASH_ATTN_2_AVAILABLE or FLASH_ATTN_3_AVAILABLE:
+ return flash_attention(
+ q=q,
+ k=k,
+ v=v,
+ q_lens=q_lens,
+ k_lens=k_lens,
+ dropout_p=dropout_p,
+ softmax_scale=softmax_scale,
+ q_scale=q_scale,
+ causal=causal,
+ window_size=window_size,
+ deterministic=deterministic,
+ dtype=dtype,
+ version=fa_version,
+ )
+ else:
+ if q_lens is not None or k_lens is not None:
+ warnings.warn(
+ 'Padding mask is disabled when using scaled_dot_product_attention. It can have a significant impact on performance.'
+ )
+ attn_mask = None
+
+ q = q.transpose(1, 2).to(dtype)
+ k = k.transpose(1, 2).to(dtype)
+ v = v.transpose(1, 2).to(dtype)
+
+ out = torch.nn.functional.scaled_dot_product_attention(
+ q, k, v, attn_mask=attn_mask, is_causal=causal, dropout_p=dropout_p)
+
+ out = out.transpose(1, 2).contiguous()
+ return out
diff --git a/Meissonic/src/pipeline.py b/Meissonic/src/pipeline.py
new file mode 100644
index 0000000000000000000000000000000000000000..f82e41130368e7232d8d164e5560355993cc711f
--- /dev/null
+++ b/Meissonic/src/pipeline.py
@@ -0,0 +1,370 @@
+# Copyright 2024 The HuggingFace Team and The MeissonFlow Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import sys
+from typing import Any, Callable, Dict, List, Optional, Tuple, Union
+import torch
+from transformers import CLIPTextModelWithProjection, CLIPTokenizer
+from diffusers.image_processor import VaeImageProcessor
+from diffusers.models import VQModel
+from diffusers.utils import replace_example_docstring
+from diffusers.pipelines.pipeline_utils import DiffusionPipeline, ImagePipelineOutput
+
+from src.scheduler import Scheduler
+from src.transformer import Transformer2DModel
+
+
+EXAMPLE_DOC_STRING = """
+ Examples:
+ ```py
+ >>> image = pipe(prompt).images[0]
+ ```
+"""
+
+
+def _prepare_latent_image_ids(batch_size, height, width, device, dtype):
+ latent_image_ids = torch.zeros(height // 2, width // 2, 3)
+ latent_image_ids[..., 1] = latent_image_ids[..., 1] + torch.arange(height // 2)[:, None]
+ latent_image_ids[..., 2] = latent_image_ids[..., 2] + torch.arange(width // 2)[None, :]
+
+ latent_image_id_height, latent_image_id_width, latent_image_id_channels = latent_image_ids.shape
+
+ latent_image_ids = latent_image_ids.reshape(
+ latent_image_id_height * latent_image_id_width, latent_image_id_channels
+ )
+
+ return latent_image_ids.to(device=device, dtype=dtype)
+
+
+class Pipeline(DiffusionPipeline):
+ image_processor: VaeImageProcessor
+ vqvae: VQModel
+ tokenizer: CLIPTokenizer
+ text_encoder: CLIPTextModelWithProjection
+ transformer: Transformer2DModel
+ scheduler: Scheduler
+ # tokenizer_t5: T5Tokenizer
+ # text_encoder_t5: T5ForConditionalGeneration
+
+ model_cpu_offload_seq = "text_encoder->transformer->vqvae"
+
+ def __init__(
+ self,
+ vqvae: VQModel,
+ tokenizer: CLIPTokenizer,
+ text_encoder: CLIPTextModelWithProjection,
+ transformer: Transformer2DModel,
+ scheduler: Scheduler,
+ # tokenizer_t5: T5Tokenizer,
+ # text_encoder_t5: T5ForConditionalGeneration,
+ ):
+ super().__init__()
+
+ self.register_modules(
+ vqvae=vqvae,
+ tokenizer=tokenizer,
+ text_encoder=text_encoder,
+ transformer=transformer,
+ scheduler=scheduler,
+ # tokenizer_t5=tokenizer_t5,
+ # text_encoder_t5=text_encoder_t5,
+ )
+ self.vae_scale_factor = 2 ** (len(self.vqvae.config.block_out_channels) - 1)
+ self.image_processor = VaeImageProcessor(vae_scale_factor=self.vae_scale_factor, do_normalize=False)
+
+ @torch.no_grad()
+ @replace_example_docstring(EXAMPLE_DOC_STRING)
+ def __call__(
+ self,
+ prompt: Optional[Union[List[str], str]] = None,
+ height: Optional[int] = 1024,
+ width: Optional[int] = 1024,
+ num_inference_steps: int = 48,
+ guidance_scale: float = 9.0,
+ negative_prompt: Optional[Union[str, List[str]]] = None,
+ num_images_per_prompt: Optional[int] = 1,
+ generator: Optional[torch.Generator] = None,
+ latents: Optional[torch.IntTensor] = None,
+ prompt_embeds: Optional[torch.Tensor] = None,
+ encoder_hidden_states: Optional[torch.Tensor] = None,
+ negative_prompt_embeds: Optional[torch.Tensor] = None,
+ negative_encoder_hidden_states: Optional[torch.Tensor] = None,
+ output_type="pil",
+ return_dict: bool = True,
+ callback: Optional[Callable[[int, int, torch.Tensor], None]] = None,
+ callback_steps: int = 1,
+ cross_attention_kwargs: Optional[Dict[str, Any]] = None,
+ micro_conditioning_aesthetic_score: int = 6,
+ micro_conditioning_crop_coord: Tuple[int, int] = (0, 0),
+ temperature: Union[int, Tuple[int, int], List[int]] = (2, 0),
+ ):
+ """
+ The call function to the pipeline for generation.
+
+ Args:
+ prompt (`str` or `List[str]`, *optional*):
+ The prompt or prompts to guide image generation. If not defined, you need to pass `prompt_embeds`.
+            height (`int`, *optional*, defaults to 1024):
+                The height in pixels of the generated image. If `None`, `self.transformer.config.sample_size * self.vae_scale_factor` is used.
+            width (`int`, *optional*, defaults to 1024):
+                The width in pixels of the generated image. If `None`, `self.transformer.config.sample_size * self.vae_scale_factor` is used.
+            num_inference_steps (`int`, *optional*, defaults to 48):
+                The number of denoising steps. More denoising steps usually lead to a higher quality image at the
+                expense of slower inference.
+            guidance_scale (`float`, *optional*, defaults to 9.0):
+                A higher guidance scale value encourages the model to generate images closely linked to the text
+                `prompt` at the expense of lower image quality. Guidance scale is enabled when `guidance_scale > 1`.
+            negative_prompt (`str` or `List[str]`, *optional*):
+                The prompt or prompts to guide what to not include in image generation. If not defined, you need to
+                pass `negative_prompt_embeds` instead. Ignored when not using guidance (`guidance_scale <= 1`).
+ num_images_per_prompt (`int`, *optional*, defaults to 1):
+ The number of images to generate per prompt.
+ generator (`torch.Generator`, *optional*):
+ A [`torch.Generator`](https://pytorch.org/docs/stable/generated/torch.Generator.html) to make
+ generation deterministic.
+ latents (`torch.IntTensor`, *optional*):
+ Pre-generated tokens representing latent vectors in `self.vqvae`, to be used as inputs for image
+                generation. If not provided, the starting latents will be completely masked.
+ prompt_embeds (`torch.Tensor`, *optional*):
+ Pre-generated text embeddings. Can be used to easily tweak text inputs (prompt weighting). If not
+ provided, text embeddings are generated from the `prompt` input argument. A single vector from the
+ pooled and projected final hidden states.
+ encoder_hidden_states (`torch.Tensor`, *optional*):
+ Pre-generated penultimate hidden states from the text encoder providing additional text conditioning.
+ negative_prompt_embeds (`torch.Tensor`, *optional*):
+ Pre-generated negative text embeddings. Can be used to easily tweak text inputs (prompt weighting). If
+ not provided, `negative_prompt_embeds` are generated from the `negative_prompt` input argument.
+ negative_encoder_hidden_states (`torch.Tensor`, *optional*):
+ Analogous to `encoder_hidden_states` for the positive prompt.
+ output_type (`str`, *optional*, defaults to `"pil"`):
+ The output format of the generated image. Choose between `PIL.Image` or `np.array`.
+ return_dict (`bool`, *optional*, defaults to `True`):
+                Whether or not to return a [`~pipelines.pipeline_utils.ImagePipelineOutput`] instead of a
+                plain tuple.
+ callback (`Callable`, *optional*):
+ A function that calls every `callback_steps` steps during inference. The function is called with the
+ following arguments: `callback(step: int, timestep: int, latents: torch.Tensor)`.
+ callback_steps (`int`, *optional*, defaults to 1):
+ The frequency at which the `callback` function is called. If not specified, the callback is called at
+ every step.
+ cross_attention_kwargs (`dict`, *optional*):
+ A kwargs dictionary that if specified is passed along to the [`AttentionProcessor`] as defined in
+ [`self.processor`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py).
+ micro_conditioning_aesthetic_score (`int`, *optional*, defaults to 6):
+ The targeted aesthetic score according to the laion aesthetic classifier. See
+ https://laion.ai/blog/laion-aesthetics/ and the micro-conditioning section of
+ https://arxiv.org/abs/2307.01952.
+ micro_conditioning_crop_coord (`Tuple[int]`, *optional*, defaults to (0, 0)):
+ The targeted height, width crop coordinates. See the micro-conditioning section of
+ https://arxiv.org/abs/2307.01952.
+ temperature (`Union[int, Tuple[int, int], List[int]]`, *optional*, defaults to (2, 0)):
+ Configures the temperature scheduler on `self.scheduler` see `Scheduler#set_timesteps`.
+
+ Examples:
+
+ Returns:
+ [`~pipelines.pipeline_utils.ImagePipelineOutput`] or `tuple`:
+ If `return_dict` is `True`, [`~pipelines.pipeline_utils.ImagePipelineOutput`] is returned, otherwise a
+ `tuple` is returned where the first element is a list with the generated images.
+ """
+ if (prompt_embeds is not None and encoder_hidden_states is None) or (
+ prompt_embeds is None and encoder_hidden_states is not None
+ ):
+ raise ValueError("pass either both `prompt_embeds` and `encoder_hidden_states` or neither")
+
+ if (negative_prompt_embeds is not None and negative_encoder_hidden_states is None) or (
+ negative_prompt_embeds is None and negative_encoder_hidden_states is not None
+ ):
+            raise ValueError(
+                "pass either both `negative_prompt_embeds` and `negative_encoder_hidden_states` or neither"
+            )
+
+ if (prompt is None and prompt_embeds is None) or (prompt is not None and prompt_embeds is not None):
+ raise ValueError("pass only one of `prompt` or `prompt_embeds`")
+
+ if isinstance(prompt, str):
+ prompt = [prompt]
+
+ if prompt is not None:
+ batch_size = len(prompt)
+ else:
+ batch_size = prompt_embeds.shape[0]
+
+ batch_size = batch_size * num_images_per_prompt
+
+ if height is None:
+ height = self.transformer.config.sample_size * self.vae_scale_factor
+
+ if width is None:
+ width = self.transformer.config.sample_size * self.vae_scale_factor
+
+ if prompt_embeds is None:
+ input_ids = self.tokenizer(
+ prompt,
+ return_tensors="pt",
+ padding="max_length",
+ truncation=True,
+ max_length=77, #self.tokenizer.model_max_length,
+ ).input_ids.to(self._execution_device)
+ # input_ids_t5 = self.tokenizer_t5(
+ # prompt,
+ # return_tensors="pt",
+ # padding="max_length",
+ # truncation=True,
+ # max_length=512,
+ # ).input_ids.to(self._execution_device)
+
+
+ outputs = self.text_encoder(input_ids, return_dict=True, output_hidden_states=True)
+ # outputs_t5 = self.text_encoder_t5(input_ids_t5, decoder_input_ids = input_ids_t5 ,return_dict=True, output_hidden_states=True)
+ prompt_embeds = outputs.text_embeds
+ encoder_hidden_states = outputs.hidden_states[-2]
+ # encoder_hidden_states = outputs_t5.encoder_hidden_states[-2]
+
+ prompt_embeds = prompt_embeds.repeat(num_images_per_prompt, 1)
+ encoder_hidden_states = encoder_hidden_states.repeat(num_images_per_prompt, 1, 1)
+
+ if guidance_scale > 1.0:
+ if negative_prompt_embeds is None:
+ if negative_prompt is None:
+ negative_prompt = [""] * len(prompt)
+
+ if isinstance(negative_prompt, str):
+ negative_prompt = [negative_prompt]
+
+ input_ids = self.tokenizer(
+ negative_prompt,
+ return_tensors="pt",
+ padding="max_length",
+ truncation=True,
+ max_length=77, #self.tokenizer.model_max_length,
+ ).input_ids.to(self._execution_device)
+ # input_ids_t5 = self.tokenizer_t5(
+ # prompt,
+ # return_tensors="pt",
+ # padding="max_length",
+ # truncation=True,
+ # max_length=512,
+ # ).input_ids.to(self._execution_device)
+
+ outputs = self.text_encoder(input_ids, return_dict=True, output_hidden_states=True)
+ # outputs_t5 = self.text_encoder_t5(input_ids_t5, decoder_input_ids = input_ids_t5 ,return_dict=True, output_hidden_states=True)
+ negative_prompt_embeds = outputs.text_embeds
+ negative_encoder_hidden_states = outputs.hidden_states[-2]
+ # negative_encoder_hidden_states = outputs_t5.encoder_hidden_states[-2]
+
+
+
+ negative_prompt_embeds = negative_prompt_embeds.repeat(num_images_per_prompt, 1)
+ negative_encoder_hidden_states = negative_encoder_hidden_states.repeat(num_images_per_prompt, 1, 1)
+
+ prompt_embeds = torch.concat([negative_prompt_embeds, prompt_embeds])
+ encoder_hidden_states = torch.concat([negative_encoder_hidden_states, encoder_hidden_states])
+
+ # Note that the micro conditionings _do_ flip the order of width, height for the original size
+ # and the crop coordinates. This is how it was done in the original code base
+ micro_conds = torch.tensor(
+ [
+ width,
+ height,
+ micro_conditioning_crop_coord[0],
+ micro_conditioning_crop_coord[1],
+ micro_conditioning_aesthetic_score,
+ ],
+ device=self._execution_device,
+ dtype=encoder_hidden_states.dtype,
+ )
+ micro_conds = micro_conds.unsqueeze(0)
+ micro_conds = micro_conds.expand(2 * batch_size if guidance_scale > 1.0 else batch_size, -1)
+
+ shape = (batch_size, height // self.vae_scale_factor, width // self.vae_scale_factor)
+
+ if latents is None:
+ latents = torch.full(
+ shape, self.scheduler.config.mask_token_id, dtype=torch.long, device=self._execution_device
+ )
+
+ self.scheduler.set_timesteps(num_inference_steps, temperature, self._execution_device)
+
+ num_warmup_steps = len(self.scheduler.timesteps) - num_inference_steps * self.scheduler.order
+ with self.progress_bar(total=num_inference_steps) as progress_bar:
+ for i, timestep in enumerate(self.scheduler.timesteps):
+ if guidance_scale > 1.0:
+ model_input = torch.cat([latents] * 2)
+ else:
+ model_input = latents
+ if height == 1024: #args.resolution == 1024:
+ img_ids = _prepare_latent_image_ids(model_input.shape[0], model_input.shape[-2],model_input.shape[-1],model_input.device,model_input.dtype)
+ else:
+ img_ids = _prepare_latent_image_ids(model_input.shape[0],2*model_input.shape[-2],2*model_input.shape[-1],model_input.device,model_input.dtype)
+ txt_ids = torch.zeros(encoder_hidden_states.shape[1],3).to(device = encoder_hidden_states.device, dtype = encoder_hidden_states.dtype)
+ model_output = self.transformer(
+ hidden_states = model_input,
+ micro_conds=micro_conds,
+ pooled_projections=prompt_embeds,
+ encoder_hidden_states=encoder_hidden_states,
+ img_ids = img_ids,
+ txt_ids = txt_ids,
+ timestep = torch.tensor([timestep], device=model_input.device, dtype=torch.long),
+ # guidance = 7,
+ # cross_attention_kwargs=cross_attention_kwargs,
+ )
+
+ if guidance_scale > 1.0:
+ uncond_logits, cond_logits = model_output.chunk(2)
+ model_output = uncond_logits + guidance_scale * (cond_logits - uncond_logits)
+
+ latents = self.scheduler.step(
+ model_output=model_output,
+ timestep=timestep,
+ sample=latents,
+ generator=generator,
+ ).prev_sample
+
+ if i == len(self.scheduler.timesteps) - 1 or (
+ (i + 1) > num_warmup_steps and (i + 1) % self.scheduler.order == 0
+ ):
+ progress_bar.update()
+ if callback is not None and i % callback_steps == 0:
+ step_idx = i // getattr(self.scheduler, "order", 1)
+ callback(step_idx, timestep, latents)
+
+ if output_type == "latent":
+ output = latents
+ else:
+ needs_upcasting = self.vqvae.dtype == torch.float16 and self.vqvae.config.force_upcast
+
+ if needs_upcasting:
+ self.vqvae.float()
+
+ output = self.vqvae.decode(
+ latents,
+ force_not_quantize=True,
+ shape=(
+ batch_size,
+ height // self.vae_scale_factor,
+ width // self.vae_scale_factor,
+ self.vqvae.config.latent_channels,
+ ),
+ ).sample.clip(0, 1)
+ output = self.image_processor.postprocess(output, output_type)
+
+ if needs_upcasting:
+ self.vqvae.half()
+
+ self.maybe_free_model_hooks()
+
+ if not return_dict:
+ return (output,)
+
+ return ImagePipelineOutput(output)
\ No newline at end of file
diff --git a/Meissonic/src/pipeline_img2img.py b/Meissonic/src/pipeline_img2img.py
new file mode 100644
index 0000000000000000000000000000000000000000..acb67d6eb6d376fd22719f9401ff49d4a9354196
--- /dev/null
+++ b/Meissonic/src/pipeline_img2img.py
@@ -0,0 +1,337 @@
+# Copyright 2024 The HuggingFace Team and The MeissonFlow Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from typing import Any, Callable, Dict, List, Optional, Tuple, Union
+import torch
+from transformers import CLIPTextModelWithProjection, CLIPTokenizer
+from diffusers.image_processor import PipelineImageInput, VaeImageProcessor
+ from diffusers.models import VQModel
+from diffusers.utils import replace_example_docstring
+from diffusers.pipelines.pipeline_utils import DiffusionPipeline, ImagePipelineOutput
+
+from src.scheduler import Scheduler
+from src.transformer import Transformer2DModel
+from src.pipeline import _prepare_latent_image_ids
+
+EXAMPLE_DOC_STRING = """
+ Examples:
+ ```py
+ >>> image = pipe(prompt, input_image).images[0]
+ ```
+"""
+
+class Img2ImgPipeline(DiffusionPipeline):
+ image_processor: VaeImageProcessor
+ vqvae: VQModel
+ tokenizer: CLIPTokenizer
+ text_encoder: CLIPTextModelWithProjection
+ transformer: Transformer2DModel #UVit2DModel
+ scheduler: Scheduler
+
+ model_cpu_offload_seq = "text_encoder->transformer->vqvae"
+
+ # TODO - when calling self.vqvae.quantize, it uses self.vqvae.quantize.embedding.weight before
+ # the forward method of self.vqvae.quantize, so the hook doesn't get called to move the parameter
+ # off the meta device. There should be a way to fix this instead of just not offloading it
+ _exclude_from_cpu_offload = ["vqvae"]
+
+ def __init__(
+ self,
+ vqvae: VQModel,
+ tokenizer: CLIPTokenizer,
+ text_encoder: CLIPTextModelWithProjection,
+ transformer: Transformer2DModel, #UVit2DModel,
+ scheduler: Scheduler,
+ ):
+ super().__init__()
+
+ self.register_modules(
+ vqvae=vqvae,
+ tokenizer=tokenizer,
+ text_encoder=text_encoder,
+ transformer=transformer,
+ scheduler=scheduler,
+ )
+ self.vae_scale_factor = 2 ** (len(self.vqvae.config.block_out_channels) - 1)
+ self.image_processor = VaeImageProcessor(vae_scale_factor=self.vae_scale_factor, do_normalize=False)
+
+ @torch.no_grad()
+ @replace_example_docstring(EXAMPLE_DOC_STRING)
+ def __call__(
+ self,
+ prompt: Optional[Union[List[str], str]] = None,
+ image: PipelineImageInput = None,
+ strength: float = 0.5,
+ num_inference_steps: int = 12,
+ guidance_scale: float = 10.0,
+ negative_prompt: Optional[Union[str, List[str]]] = None,
+ num_images_per_prompt: Optional[int] = 1,
+ generator: Optional[torch.Generator] = None,
+ prompt_embeds: Optional[torch.Tensor] = None,
+ encoder_hidden_states: Optional[torch.Tensor] = None,
+ negative_prompt_embeds: Optional[torch.Tensor] = None,
+ negative_encoder_hidden_states: Optional[torch.Tensor] = None,
+ output_type="pil",
+ return_dict: bool = True,
+ callback: Optional[Callable[[int, int, torch.Tensor], None]] = None,
+ callback_steps: int = 1,
+ cross_attention_kwargs: Optional[Dict[str, Any]] = None,
+ micro_conditioning_aesthetic_score: int = 6,
+ micro_conditioning_crop_coord: Tuple[int, int] = (0, 0),
+ temperature: Union[int, Tuple[int, int], List[int]] = (2, 0),
+ ):
+ """
+ The call function to the pipeline for generation.
+
+ Args:
+ prompt (`str` or `List[str]`, *optional*):
+ The prompt or prompts to guide image generation. If not defined, you need to pass `prompt_embeds`.
+ image (`torch.Tensor`, `PIL.Image.Image`, `np.ndarray`, `List[torch.Tensor]`, `List[PIL.Image.Image]`, or `List[np.ndarray]`):
+ `Image`, NumPy array, or tensor representing the image batch used as the starting point. For both
+ NumPy arrays and PyTorch tensors, the expected value range is `[0, 1]`. A tensor or list of tensors
+ should have shape `(B, C, H, W)` or `(C, H, W)`. A NumPy array or list of arrays should have shape
+ `(B, H, W, C)` or `(H, W, C)`. This pipeline always encodes `image` with the VQ-VAE, so pass
+ pixel-space images rather than latents.
+ strength (`float`, *optional*, defaults to 0.5):
+ Indicates extent to transform the reference `image`. Must be between 0 and 1. `image` is used as a
+ starting point and more noise is added the higher the `strength`. The number of denoising steps depends
+ on the amount of noise initially added. When `strength` is 1, added noise is maximum and the denoising
+ process runs for the full number of iterations specified in `num_inference_steps`. A value of 1
+ essentially ignores `image`.
+ num_inference_steps (`int`, *optional*, defaults to 12):
+ The number of denoising steps. More denoising steps usually lead to a higher quality image at the
+ expense of slower inference.
+ guidance_scale (`float`, *optional*, defaults to 10.0):
+ A higher guidance scale value encourages the model to generate images closely linked to the text
+ `prompt` at the expense of lower image quality. Guidance scale is enabled when `guidance_scale > 1`.
+ negative_prompt (`str` or `List[str]`, *optional*):
+ The prompt or prompts to guide what to not include in image generation. If neither this nor
+ `negative_prompt_embeds` is passed, an empty prompt is used. Ignored when not using guidance (`guidance_scale <= 1`).
+ num_images_per_prompt (`int`, *optional*, defaults to 1):
+ The number of images to generate per prompt.
+ generator (`torch.Generator`, *optional*):
+ A [`torch.Generator`](https://pytorch.org/docs/stable/generated/torch.Generator.html) to make
+ generation deterministic.
+ prompt_embeds (`torch.Tensor`, *optional*):
+ Pre-generated text embeddings. Can be used to easily tweak text inputs (prompt weighting). If not
+ provided, text embeddings are generated from the `prompt` input argument. A single vector from the
+ pooled and projected final hidden states.
+ encoder_hidden_states (`torch.Tensor`, *optional*):
+ Pre-generated penultimate hidden states from the text encoder providing additional text conditioning.
+ negative_prompt_embeds (`torch.Tensor`, *optional*):
+ Pre-generated negative text embeddings. Can be used to easily tweak text inputs (prompt weighting). If
+ not provided, `negative_prompt_embeds` are generated from the `negative_prompt` input argument.
+ negative_encoder_hidden_states (`torch.Tensor`, *optional*):
+ Analogous to `encoder_hidden_states` for the positive prompt.
+ output_type (`str`, *optional*, defaults to `"pil"`):
+ The output format of the generated image. Choose between `"pil"` (`PIL.Image`) or `"np"` (`np.array`); `"latent"` returns the token ids without decoding.
+ return_dict (`bool`, *optional*, defaults to `True`):
+ Whether or not to return a [`~pipelines.pipeline_utils.ImagePipelineOutput`] instead of a
+ plain tuple.
+ callback (`Callable`, *optional*):
+ A function that is called every `callback_steps` steps during inference. The function is called with the
+ following arguments: `callback(step: int, timestep: int, latents: torch.Tensor)`.
+ callback_steps (`int`, *optional*, defaults to 1):
+ The frequency at which the `callback` function is called. If not specified, the callback is called at
+ every step.
+ cross_attention_kwargs (`dict`, *optional*):
+ A kwargs dictionary intended for the [`AttentionProcessor`] as defined in
+ [`self.processor`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py). Currently accepted but not forwarded to the transformer.
+ micro_conditioning_aesthetic_score (`int`, *optional*, defaults to 6):
+ The targeted aesthetic score according to the laion aesthetic classifier. See
+ https://laion.ai/blog/laion-aesthetics/ and the micro-conditioning section of
+ https://arxiv.org/abs/2307.01952.
+ micro_conditioning_crop_coord (`Tuple[int, int]`, *optional*, defaults to (0, 0)):
+ The targeted height, width crop coordinates. See the micro-conditioning section of
+ https://arxiv.org/abs/2307.01952.
+ temperature (`Union[int, Tuple[int, int], List[int]]`, *optional*, defaults to (2, 0)):
+ Configures the temperature scheduler on `self.scheduler`; see `Scheduler.set_timesteps`.
+
+ Examples:
+
+ Returns:
+ [`~pipelines.pipeline_utils.ImagePipelineOutput`] or `tuple`:
+ If `return_dict` is `True`, [`~pipelines.pipeline_utils.ImagePipelineOutput`] is returned, otherwise a
+ `tuple` is returned where the first element is a list with the generated images.
+ """
+
+ if (prompt_embeds is not None and encoder_hidden_states is None) or (
+ prompt_embeds is None and encoder_hidden_states is not None
+ ):
+ raise ValueError("pass either both `prompt_embeds` and `encoder_hidden_states` or neither")
+
+ if (negative_prompt_embeds is not None and negative_encoder_hidden_states is None) or (
+ negative_prompt_embeds is None and negative_encoder_hidden_states is not None
+ ):
+ raise ValueError(
+ "pass either both `negative_prompt_embeds` and `negative_encoder_hidden_states` or neither"
+ )
+
+ if (prompt is None and prompt_embeds is None) or (prompt is not None and prompt_embeds is not None):
+ raise ValueError("pass only one of `prompt` or `prompt_embeds`")
+
+ if isinstance(prompt, str):
+ prompt = [prompt]
+
+ if prompt is not None:
+ batch_size = len(prompt)
+ else:
+ batch_size = prompt_embeds.shape[0]
+
+ batch_size = batch_size * num_images_per_prompt
+
+ if prompt_embeds is None:
+ input_ids = self.tokenizer(
+ prompt,
+ return_tensors="pt",
+ padding="max_length",
+ truncation=True,
+ max_length=77, #self.tokenizer.model_max_length,
+ ).input_ids.to(self._execution_device)
+
+ outputs = self.text_encoder(input_ids, return_dict=True, output_hidden_states=True)
+ prompt_embeds = outputs.text_embeds
+ encoder_hidden_states = outputs.hidden_states[-2]
+
+ prompt_embeds = prompt_embeds.repeat(num_images_per_prompt, 1)
+ encoder_hidden_states = encoder_hidden_states.repeat(num_images_per_prompt, 1, 1)
+
+ if guidance_scale > 1.0:
+ if negative_prompt_embeds is None:
+ if negative_prompt is None:
+ negative_prompt = [""] * (batch_size // num_images_per_prompt)  # also valid when only `prompt_embeds` is passed
+
+ if isinstance(negative_prompt, str):
+ negative_prompt = [negative_prompt]
+
+ input_ids = self.tokenizer(
+ negative_prompt,
+ return_tensors="pt",
+ padding="max_length",
+ truncation=True,
+ max_length=77, #self.tokenizer.model_max_length,
+ ).input_ids.to(self._execution_device)
+
+ outputs = self.text_encoder(input_ids, return_dict=True, output_hidden_states=True)
+ negative_prompt_embeds = outputs.text_embeds
+ negative_encoder_hidden_states = outputs.hidden_states[-2]
+
+ negative_prompt_embeds = negative_prompt_embeds.repeat(num_images_per_prompt, 1)
+ negative_encoder_hidden_states = negative_encoder_hidden_states.repeat(num_images_per_prompt, 1, 1)
+
+ prompt_embeds = torch.concat([negative_prompt_embeds, prompt_embeds])
+ encoder_hidden_states = torch.concat([negative_encoder_hidden_states, encoder_hidden_states])
+
+ image = self.image_processor.preprocess(image)
+
+ height, width = image.shape[-2:]
+
+ # Note that the micro conditionings _do_ flip the order of width, height for the original size
+ # and the crop coordinates. This is how it was done in the original code base
+ micro_conds = torch.tensor(
+ [
+ width,
+ height,
+ micro_conditioning_crop_coord[0],
+ micro_conditioning_crop_coord[1],
+ micro_conditioning_aesthetic_score,
+ ],
+ device=self._execution_device,
+ dtype=encoder_hidden_states.dtype,
+ )
+
+ micro_conds = micro_conds.unsqueeze(0)
+ micro_conds = micro_conds.expand(2 * batch_size if guidance_scale > 1.0 else batch_size, -1)
+
+ self.scheduler.set_timesteps(num_inference_steps, temperature, self._execution_device)
+ num_inference_steps = int(len(self.scheduler.timesteps) * strength)
+ start_timestep_idx = len(self.scheduler.timesteps) - num_inference_steps
+
+ needs_upcasting = False  # upcasting disabled; was: self.vqvae.dtype == torch.float16 and self.vqvae.config.force_upcast
+
+ if needs_upcasting:
+ self.vqvae.float()
+
+ latents = self.vqvae.encode(image.to(dtype=self.vqvae.dtype, device=self._execution_device)).latents
+ latents_bsz, channels, latents_height, latents_width = latents.shape
+ latents = self.vqvae.quantize(latents)[2][2].reshape(latents_bsz, latents_height, latents_width)
+ latents = self.scheduler.add_noise(
+ latents, self.scheduler.timesteps[start_timestep_idx - 1], generator=generator
+ )
+ latents = latents.repeat(num_images_per_prompt, 1, 1)
+
+ with self.progress_bar(total=num_inference_steps) as progress_bar:
+ for i in range(start_timestep_idx, len(self.scheduler.timesteps)):
+ timestep = self.scheduler.timesteps[i]
+
+ if guidance_scale > 1.0:
+ model_input = torch.cat([latents] * 2)
+ else:
+ model_input = latents
+ if height == 1024:  # args.resolution == 1024
+ img_ids = _prepare_latent_image_ids(model_input.shape[0], model_input.shape[-2], model_input.shape[-1], model_input.device, model_input.dtype)
+ else:
+ img_ids = _prepare_latent_image_ids(model_input.shape[0], 2 * model_input.shape[-2], 2 * model_input.shape[-1], model_input.device, model_input.dtype)
+ txt_ids = torch.zeros(encoder_hidden_states.shape[1], 3).to(device=encoder_hidden_states.device, dtype=encoder_hidden_states.dtype)
+ model_output = self.transformer(
+ model_input,
+ micro_conds=micro_conds,
+ pooled_projections=prompt_embeds,
+ encoder_hidden_states=encoder_hidden_states,
+ # cross_attention_kwargs=cross_attention_kwargs,
+ img_ids=img_ids,
+ txt_ids=txt_ids,
+ timestep=torch.tensor([timestep], device=model_input.device, dtype=torch.long),
+ )
+
+ if guidance_scale > 1.0:
+ uncond_logits, cond_logits = model_output.chunk(2)
+ model_output = uncond_logits + guidance_scale * (cond_logits - uncond_logits)
+
+ latents = self.scheduler.step(
+ model_output=model_output,
+ timestep=timestep,
+ sample=latents,
+ generator=generator,
+ ).prev_sample
+
+ if i == len(self.scheduler.timesteps) - 1 or ((i + 1) % self.scheduler.order == 0):
+ progress_bar.update()
+ if callback is not None and i % callback_steps == 0:
+ step_idx = i // getattr(self.scheduler, "order", 1)
+ callback(step_idx, timestep, latents)
+
+ if output_type == "latent":
+ output = latents
+ else:
+ output = self.vqvae.decode(
+ latents,
+ force_not_quantize=True,
+ shape=(
+ batch_size,
+ height // self.vae_scale_factor,
+ width // self.vae_scale_factor,
+ self.vqvae.config.latent_channels,
+ ),
+ ).sample.clip(0, 1)
+ output = self.image_processor.postprocess(output, output_type)
+
+ if needs_upcasting:
+ self.vqvae.half()
+
+ self.maybe_free_model_hooks()
+
+ if not return_dict:
+ return (output,)
+
+ return ImagePipelineOutput(output)
diff --git a/Meissonic/src/pipeline_inpaint.py b/Meissonic/src/pipeline_inpaint.py
new file mode 100644
index 0000000000000000000000000000000000000000..4e8a08b0dcf357660ac9038e59b05a95eefb94a4
--- /dev/null
+++ b/Meissonic/src/pipeline_inpaint.py
@@ -0,0 +1,361 @@
+# Copyright 2024 The HuggingFace Team and The MeissonFlow Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from typing import Any, Callable, Dict, List, Optional, Tuple, Union
+import torch
+from transformers import CLIPTextModelWithProjection, CLIPTokenizer
+from diffusers.image_processor import PipelineImageInput, VaeImageProcessor
+from diffusers.models import VQModel
+from diffusers.utils import replace_example_docstring
+from diffusers.pipelines.pipeline_utils import DiffusionPipeline, ImagePipelineOutput
+from src.scheduler import Scheduler
+from src.transformer import Transformer2DModel
+from src.pipeline import _prepare_latent_image_ids
+
+
+EXAMPLE_DOC_STRING = """
+ Examples:
+ ```py
+ >>> pipe(prompt, input_image, mask).images[0].save("out.png")
+ ```
+"""
+
+class InpaintPipeline(DiffusionPipeline):
+ image_processor: VaeImageProcessor
+ vqvae: VQModel
+ tokenizer: CLIPTokenizer
+ text_encoder: CLIPTextModelWithProjection
+ transformer: Transformer2DModel #UVit2DModel
+ scheduler: Scheduler
+
+ model_cpu_offload_seq = "text_encoder->transformer->vqvae"
+
+ # TODO - when calling self.vqvae.quantize, it uses self.vqvae.quantize.embedding.weight before
+ # the forward method of self.vqvae.quantize, so the hook doesn't get called to move the parameter
+ # off the meta device. There should be a way to fix this instead of just not offloading it
+ _exclude_from_cpu_offload = ["vqvae"]
+
+ def __init__(
+ self,
+ vqvae: VQModel,
+ tokenizer: CLIPTokenizer,
+ text_encoder: CLIPTextModelWithProjection,
+ transformer: Transformer2DModel, #UVit2DModel,
+ scheduler: Scheduler,
+ ):
+ super().__init__()
+
+ self.register_modules(
+ vqvae=vqvae,
+ tokenizer=tokenizer,
+ text_encoder=text_encoder,
+ transformer=transformer,
+ scheduler=scheduler,
+ )
+ self.vae_scale_factor = 2 ** (len(self.vqvae.config.block_out_channels) - 1)
+ self.image_processor = VaeImageProcessor(vae_scale_factor=self.vae_scale_factor, do_normalize=False)
+ self.mask_processor = VaeImageProcessor(
+ vae_scale_factor=self.vae_scale_factor,
+ do_normalize=False,
+ do_binarize=True,
+ do_convert_grayscale=True,
+ do_resize=True,
+ )
+ self.scheduler.register_to_config(masking_schedule="linear")
+
+ @torch.no_grad()
+ @replace_example_docstring(EXAMPLE_DOC_STRING)
+ def __call__(
+ self,
+ prompt: Optional[Union[List[str], str]] = None,
+ image: PipelineImageInput = None,
+ mask_image: PipelineImageInput = None,
+ strength: float = 1.0,
+ num_inference_steps: int = 12,
+ guidance_scale: float = 10.0,
+ negative_prompt: Optional[Union[str, List[str]]] = None,
+ num_images_per_prompt: Optional[int] = 1,
+ generator: Optional[torch.Generator] = None,
+ prompt_embeds: Optional[torch.Tensor] = None,
+ encoder_hidden_states: Optional[torch.Tensor] = None,
+ negative_prompt_embeds: Optional[torch.Tensor] = None,
+ negative_encoder_hidden_states: Optional[torch.Tensor] = None,
+ output_type="pil",
+ return_dict: bool = True,
+ callback: Optional[Callable[[int, int, torch.Tensor], None]] = None,
+ callback_steps: int = 1,
+ cross_attention_kwargs: Optional[Dict[str, Any]] = None,
+ micro_conditioning_aesthetic_score: int = 6,
+ micro_conditioning_crop_coord: Tuple[int, int] = (0, 0),
+ temperature: Union[int, Tuple[int, int], List[int]] = (2, 0),
+ ):
+ """
+ The call function to the pipeline for generation.
+
+ Args:
+ prompt (`str` or `List[str]`, *optional*):
+ The prompt or prompts to guide image generation. If not defined, you need to pass `prompt_embeds`.
+ image (`torch.Tensor`, `PIL.Image.Image`, `np.ndarray`, `List[torch.Tensor]`, `List[PIL.Image.Image]`, or `List[np.ndarray]`):
+ `Image`, NumPy array, or tensor representing the image batch used as the starting point. For both
+ NumPy arrays and PyTorch tensors, the expected value range is `[0, 1]`. A tensor or list of tensors
+ should have shape `(B, C, H, W)` or `(C, H, W)`. A NumPy array or list of arrays should have shape
+ `(B, H, W, C)` or `(H, W, C)`. This pipeline always encodes `image` with the VQ-VAE, so pass
+ pixel-space images rather than latents.
+ mask_image (`torch.Tensor`, `PIL.Image.Image`, `np.ndarray`, `List[torch.Tensor]`, `List[PIL.Image.Image]`, or `List[np.ndarray]`):
+ `Image`, NumPy array, or tensor representing an image batch used to mask `image`. White pixels in the
+ mask are repainted while black pixels are preserved. A PIL `mask_image` is converted to a single
+ channel (luminance) before use. A NumPy array or PyTorch tensor should contain one color channel (L)
+ instead of 3: the expected tensor shapes are `(B, 1, H, W)`, `(B, H, W)`, `(1, H, W)`, or `(H, W)`,
+ and the expected array shapes are `(B, H, W, 1)`, `(B, H, W)`, `(H, W, 1)`, or `(H, W)`. The mask is
+ resized to the latent resolution and binarized before it is applied.
+ strength (`float`, *optional*, defaults to 1.0):
+ Fraction of the denoising schedule to run. Must be between 0 and 1. The pipeline runs the last
+ `int(len(timesteps) * strength)` scheduler steps, so when `strength` is 1 the process runs for the
+ full number of iterations specified in `num_inference_steps`. Unlike image-to-image, no noise is
+ added to `image` in this pipeline: the tokens under `mask_image` are set to the mask token before
+ denoising starts.
+ num_inference_steps (`int`, *optional*, defaults to 12):
+ The number of denoising steps. More denoising steps usually lead to a higher quality image at the
+ expense of slower inference.
+ guidance_scale (`float`, *optional*, defaults to 10.0):
+ A higher guidance scale value encourages the model to generate images closely linked to the text
+ `prompt` at the expense of lower image quality. Guidance scale is enabled when `guidance_scale > 1`.
+ negative_prompt (`str` or `List[str]`, *optional*):
+ The prompt or prompts to guide what to not include in image generation. If neither this nor
+ `negative_prompt_embeds` is passed, an empty prompt is used. Ignored when not using guidance (`guidance_scale <= 1`).
+ num_images_per_prompt (`int`, *optional*, defaults to 1):
+ The number of images to generate per prompt.
+ generator (`torch.Generator`, *optional*):
+ A [`torch.Generator`](https://pytorch.org/docs/stable/generated/torch.Generator.html) to make
+ generation deterministic.
+ prompt_embeds (`torch.Tensor`, *optional*):
+ Pre-generated text embeddings. Can be used to easily tweak text inputs (prompt weighting). If not
+ provided, text embeddings are generated from the `prompt` input argument. A single vector from the
+ pooled and projected final hidden states.
+ encoder_hidden_states (`torch.Tensor`, *optional*):
+ Pre-generated penultimate hidden states from the text encoder providing additional text conditioning.
+ negative_prompt_embeds (`torch.Tensor`, *optional*):
+ Pre-generated negative text embeddings. Can be used to easily tweak text inputs (prompt weighting). If
+ not provided, `negative_prompt_embeds` are generated from the `negative_prompt` input argument.
+ negative_encoder_hidden_states (`torch.Tensor`, *optional*):
+ Analogous to `encoder_hidden_states` for the positive prompt.
+ output_type (`str`, *optional*, defaults to `"pil"`):
+ The output format of the generated image. Choose between `"pil"` (`PIL.Image`) or `"np"` (`np.array`); `"latent"` returns the token ids without decoding.
+ return_dict (`bool`, *optional*, defaults to `True`):
+ Whether or not to return a [`~pipelines.pipeline_utils.ImagePipelineOutput`] instead of a
+ plain tuple.
+ callback (`Callable`, *optional*):
+ A function that is called every `callback_steps` steps during inference. The function is called with the
+ following arguments: `callback(step: int, timestep: int, latents: torch.Tensor)`.
+ callback_steps (`int`, *optional*, defaults to 1):
+ The frequency at which the `callback` function is called. If not specified, the callback is called at
+ every step.
+ cross_attention_kwargs (`dict`, *optional*):
+ A kwargs dictionary intended for the [`AttentionProcessor`] as defined in
+ [`self.processor`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py). Currently accepted but not forwarded to the transformer.
+ micro_conditioning_aesthetic_score (`int`, *optional*, defaults to 6):
+ The targeted aesthetic score according to the laion aesthetic classifier. See
+ https://laion.ai/blog/laion-aesthetics/ and the micro-conditioning section of
+ https://arxiv.org/abs/2307.01952.
+ micro_conditioning_crop_coord (`Tuple[int, int]`, *optional*, defaults to (0, 0)):
+ The targeted height, width crop coordinates. See the micro-conditioning section of
+ https://arxiv.org/abs/2307.01952.
+ temperature (`Union[int, Tuple[int, int], List[int]]`, *optional*, defaults to (2, 0)):
+ Configures the temperature scheduler on `self.scheduler`; see `Scheduler.set_timesteps`.
+
+ Examples:
+
+ Returns:
+ [`~pipelines.pipeline_utils.ImagePipelineOutput`] or `tuple`:
+ If `return_dict` is `True`, [`~pipelines.pipeline_utils.ImagePipelineOutput`] is returned, otherwise a
+ `tuple` is returned where the first element is a list with the generated images.
+ """
+
+ if (prompt_embeds is not None and encoder_hidden_states is None) or (
+ prompt_embeds is None and encoder_hidden_states is not None
+ ):
+ raise ValueError("pass either both `prompt_embeds` and `encoder_hidden_states` or neither")
+
+ if (negative_prompt_embeds is not None and negative_encoder_hidden_states is None) or (
+ negative_prompt_embeds is None and negative_encoder_hidden_states is not None
+ ):
+ raise ValueError(
+ "pass either both `negative_prompt_embeds` and `negative_encoder_hidden_states` or neither"
+ )
+
+ if (prompt is None and prompt_embeds is None) or (prompt is not None and prompt_embeds is not None):
+ raise ValueError("pass only one of `prompt` or `prompt_embeds`")
+
+ if isinstance(prompt, str):
+ prompt = [prompt]
+
+ if prompt is not None:
+ batch_size = len(prompt)
+ else:
+ batch_size = prompt_embeds.shape[0]
+
+ batch_size = batch_size * num_images_per_prompt
+
+ if prompt_embeds is None:
+ input_ids = self.tokenizer(
+ prompt,
+ return_tensors="pt",
+ padding="max_length",
+ truncation=True,
+ max_length=77, #self.tokenizer.model_max_length,
+ ).input_ids.to(self._execution_device)
+
+ outputs = self.text_encoder(input_ids, return_dict=True, output_hidden_states=True)
+ prompt_embeds = outputs.text_embeds
+ encoder_hidden_states = outputs.hidden_states[-2]
+
+ prompt_embeds = prompt_embeds.repeat(num_images_per_prompt, 1)
+ encoder_hidden_states = encoder_hidden_states.repeat(num_images_per_prompt, 1, 1)
+
+ if guidance_scale > 1.0:
+ if negative_prompt_embeds is None:
+ if negative_prompt is None:
+ negative_prompt = [""] * (batch_size // num_images_per_prompt)  # also valid when only `prompt_embeds` is passed
+
+ if isinstance(negative_prompt, str):
+ negative_prompt = [negative_prompt]
+
+ input_ids = self.tokenizer(
+ negative_prompt,
+ return_tensors="pt",
+ padding="max_length",
+ truncation=True,
+ max_length=77, #self.tokenizer.model_max_length,
+ ).input_ids.to(self._execution_device)
+
+ outputs = self.text_encoder(input_ids, return_dict=True, output_hidden_states=True)
+ negative_prompt_embeds = outputs.text_embeds
+ negative_encoder_hidden_states = outputs.hidden_states[-2]
+
+ negative_prompt_embeds = negative_prompt_embeds.repeat(num_images_per_prompt, 1)
+ negative_encoder_hidden_states = negative_encoder_hidden_states.repeat(num_images_per_prompt, 1, 1)
+
+ prompt_embeds = torch.concat([negative_prompt_embeds, prompt_embeds])
+ encoder_hidden_states = torch.concat([negative_encoder_hidden_states, encoder_hidden_states])
+
+ image = self.image_processor.preprocess(image)
+
+ height, width = image.shape[-2:]
+
+ # Note that the micro conditionings _do_ flip the order of width, height for the original size
+ # and the crop coordinates. This is how it was done in the original code base
+ micro_conds = torch.tensor(
+ [
+ width,
+ height,
+ micro_conditioning_crop_coord[0],
+ micro_conditioning_crop_coord[1],
+ micro_conditioning_aesthetic_score,
+ ],
+ device=self._execution_device,
+ dtype=encoder_hidden_states.dtype,
+ )
+
+ micro_conds = micro_conds.unsqueeze(0)
+ micro_conds = micro_conds.expand(2 * batch_size if guidance_scale > 1.0 else batch_size, -1)
+
+ self.scheduler.set_timesteps(num_inference_steps, temperature, self._execution_device)
+ num_inference_steps = int(len(self.scheduler.timesteps) * strength)
+ start_timestep_idx = len(self.scheduler.timesteps) - num_inference_steps
+
+ needs_upcasting = False  # upcasting disabled; was: self.vqvae.dtype == torch.float16 and self.vqvae.config.force_upcast
+
+ if needs_upcasting:
+ self.vqvae.float()
+
+ latents = self.vqvae.encode(image.to(dtype=self.vqvae.dtype, device=self._execution_device)).latents
+ latents_bsz, channels, latents_height, latents_width = latents.shape
+ latents = self.vqvae.quantize(latents)[2][2].reshape(latents_bsz, latents_height, latents_width)
+
+ mask = self.mask_processor.preprocess(
+ mask_image, height // self.vae_scale_factor, width // self.vae_scale_factor
+ )
+ mask = mask.reshape(mask.shape[0], latents_height, latents_width).bool().to(latents.device)
+ latents[mask] = self.scheduler.config.mask_token_id
+
+ starting_mask_ratio = mask.sum() / latents.numel()
+
+ latents = latents.repeat(num_images_per_prompt, 1, 1)
+
+ with self.progress_bar(total=num_inference_steps) as progress_bar:
+ for i in range(start_timestep_idx, len(self.scheduler.timesteps)):
+ timestep = self.scheduler.timesteps[i]
+
+ if guidance_scale > 1.0:
+ model_input = torch.cat([latents] * 2)
+ else:
+ model_input = latents
+
+ if height == 1024:  # args.resolution == 1024
+ img_ids = _prepare_latent_image_ids(model_input.shape[0], model_input.shape[-2], model_input.shape[-1], model_input.device, model_input.dtype)
+ else:
+ img_ids = _prepare_latent_image_ids(model_input.shape[0], 2 * model_input.shape[-2], 2 * model_input.shape[-1], model_input.device, model_input.dtype)
+ txt_ids = torch.zeros(encoder_hidden_states.shape[1], 3).to(device=encoder_hidden_states.device, dtype=encoder_hidden_states.dtype)
+ model_output = self.transformer(
+ model_input,
+ micro_conds=micro_conds,
+ pooled_projections=prompt_embeds,
+ encoder_hidden_states=encoder_hidden_states,
+ # cross_attention_kwargs=cross_attention_kwargs,
+ img_ids=img_ids,
+ txt_ids=txt_ids,
+ timestep=torch.tensor([timestep], device=model_input.device, dtype=torch.long),
+ )
+
+ if guidance_scale > 1.0:
+ uncond_logits, cond_logits = model_output.chunk(2)
+ model_output = uncond_logits + guidance_scale * (cond_logits - uncond_logits)
+
+ latents = self.scheduler.step(
+ model_output=model_output,
+ timestep=timestep,
+ sample=latents,
+ generator=generator,
+ starting_mask_ratio=starting_mask_ratio,
+ ).prev_sample
+
+ if i == len(self.scheduler.timesteps) - 1 or ((i + 1) % self.scheduler.order == 0):
+ progress_bar.update()
+ if callback is not None and i % callback_steps == 0:
+ step_idx = i // getattr(self.scheduler, "order", 1)
+ callback(step_idx, timestep, latents)
+
+ if output_type == "latent":
+ output = latents
+ else:
+ output = self.vqvae.decode(
+ latents,
+ force_not_quantize=True,
+ shape=(
+ batch_size,
+ height // self.vae_scale_factor,
+ width // self.vae_scale_factor,
+ self.vqvae.config.latent_channels,
+ ),
+ ).sample.clip(0, 1)
+ output = self.image_processor.postprocess(output, output_type)
+
+ if needs_upcasting:
+ self.vqvae.half()
+
+ self.maybe_free_model_hooks()
+
+ if not return_dict:
+ return (output,)
+
+ return ImagePipelineOutput(output)
diff --git a/Meissonic/src/pipeline_video.py b/Meissonic/src/pipeline_video.py
new file mode 100644
index 0000000000000000000000000000000000000000..3c81182f25ce5d192a06cffc736bc1f519ff1a8f
--- /dev/null
+++ b/Meissonic/src/pipeline_video.py
@@ -0,0 +1,1139 @@
+# Copyright 2024 The HuggingFace Team and The MeissonFlow Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import sys
+import os
+# Add project root to path to allow imports when running as script
+sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+
+from typing import Any, Callable, Dict, List, Optional, Tuple, Union
+from dataclasses import dataclass
+import torch
+import torch.nn as nn
+from transformers import T5Tokenizer, T5EncoderModel
+from diffusers.utils import replace_example_docstring
+from diffusers.pipelines.pipeline_utils import DiffusionPipeline
+
+from src.scheduler_video import Scheduler
+from src.transformer_video import WanDiscreteVideoTransformer
+
+# Global debug flag - set to False to disable debug prints
+DEBUG_PIPELINE = False
+
+
+@dataclass
+class VideoPipelineOutput:
+ """
+ Output class for video generation pipelines.
+
+ Args:
+ videos: Generated videos. Can be:
+ - torch.Tensor of shape [B, C, F, H, W] when output_type="pt"
+ - torch.LongTensor of discrete codes, shape [B, F', H', W'], when output_type="latent"
+ - numpy.ndarray of shape [B, C, F, H, W] when output_type="np"
+ - List of lists of PIL Images (one list of frames per batch item) when output_type="pil"
+ """
+ videos: Union[torch.Tensor, List[torch.Tensor], List, Any]
+
+
+class CosmosVideoTokenizer(nn.Module):
+ """
+ Wrapper around a Cosmos DV (Discrete Video) tokenizer for video encoding/decoding.
+
+ This class provides a clean interface to encode videos into discrete codes and decode
+ them back to video tensors. It wraps the Cosmos DV tokenizer models loaded from HuggingFace.
+
+ Attributes:
+ t_downsample (int): Temporal compression factor (frames downsampled by this factor).
+ h_downsample (int): Height compression factor (height downsampled by this factor).
+ w_downsample (int): Width compression factor (width downsampled by this factor).
+ codebook_size (int): Number of unique discrete codes in the codebook.
+ mask_token_id (int): Token ID used for masking during diffusion. Set to codebook_size,
+ meaning valid token indices are [0, codebook_size-1], and codebook_size
+ is reserved for masking.
+ """
+
+ def __init__(self, model_id: str, device: torch.device, dtype: torch.dtype):
+ """
+ Initialize the Cosmos DV video tokenizer.
+
+ Args:
+ model_id (str): HuggingFace model identifier (e.g., "nvidia/Cosmos-0.1-Tokenizer-DV4x8x8").
+ Currently stored for reference only; the weights are read from a hard-coded local directory.
+ device (torch.device): Device to load the tokenizer on (e.g., "cuda" or "cpu").
+ dtype (torch.dtype): Data type for the tokenizer (e.g., torch.float32, torch.bfloat16).
+ """
+ super().__init__()
+ self.device = device
+ self.dtype = dtype
+ self.model_id = model_id
+
+ # Load from a local checkpoint directory (the Hub download below is disabled)
+ try:
+ # from huggingface_hub import snapshot_download
+ # import os
+
+ # # Handle both full repo_id and just model name
+ # if "/" not in model_id:
+ # repo_id = f"nvidia/{model_id}"
+ # else:
+ # repo_id = model_id
+
+ # # Download the model
+ # local_dir = f"pretrained_ckpts/{model_id.replace('/', '_')}"
+ # os.makedirs(local_dir, exist_ok=True)
+ # snapshot_download(repo_id=repo_id, local_dir=local_dir)
+ local_dir = "/mnt/Meissonic/pretrained_ckpts/Cosmos-0.1-Tokenizer-DV4x8x8"  # NOTE: hard-coded; `model_id` is not used to locate the files
+
+ # Try loading as torch.jit models
+ encoder_path = f"{local_dir}/encoder.jit"
+ decoder_path = f"{local_dir}/decoder.jit"
+
+ if os.path.exists(encoder_path) and os.path.exists(decoder_path):
+ # Load models in float32 (TorchScript models often don't support dtype conversion well)
+ # We'll convert inputs to float32 when needed
+ self.encoder = torch.jit.load(encoder_path).to(device).eval()
+ self.decoder = torch.jit.load(decoder_path).to(device).eval()
+ # Store the model dtype (typically float32 for TorchScript)
+ self.model_dtype = torch.float32
+ else:
+ # Try alternative loading methods (e.g., from diffusers or transformers)
+ raise FileNotFoundError(f"Could not find encoder.jit or decoder.jit in {local_dir}")
+
+ except Exception as e:
+ # No fallback loader is implemented yet; this block only re-raises with context
+ try:
+ # Placeholder for an alternative loader (e.g. AutoModel);
+ # adjust based on the actual Cosmos tokenizer API
+ raise NotImplementedError(
+ f"Failed to load Cosmos tokenizer from {model_id}. "
+ f"Error: {e}. Please ensure encoder.jit and decoder.jit exist in the local checkpoint directory."
+ )
+ except Exception as e2:
+ raise RuntimeError(
+ f"Could not load Cosmos tokenizer: {e}. "
+ f"Please check that the model_id '{model_id}' is correct and accessible."
+ ) from e2
+
+ # Compression factors for the DV4x8x8 checkpoint loaded above
+ # These values depend on the specific tokenizer variant and must match it
+ # For Cosmos-0.1-Tokenizer-DV4x8x8 (using the `//` arithmetic of this pipeline):
+ # - Temporal: 4x downsampling (16 frames -> 4 latent frames)
+ # - Spatial: 8x8 downsampling (480x848 -> 60x106)
+ self.t_downsample = 4 # Temporal compression factor (8 for DV8x16x16)
+ self.h_downsample = 8 # Height compression factor (16 for DV8x16x16)
+ self.w_downsample = 8 # Width compression factor (16 for DV8x16x16)
+ self.codebook_size = 64000 # Number of unique codes (not 2^16 = 65536)
+
+ # Mask token ID: codebook_size is reserved for masking during diffusion
+ # This ensures all Cosmos codes [0, codebook_size-1] remain valid
+ # Extended vocab: [0, codebook_size-1] = valid codes, codebook_size = mask_token_id
+ self.mask_token_id = self.codebook_size
+
+ def encode(self, video: torch.Tensor) -> torch.LongTensor:
+ """
+ Encode a video tensor into discrete code indices.
+
+ Args:
+ video (torch.Tensor): Input video tensor of shape [B, C, F, H, W].
+ Values should be in [0, 1] or [0, 255].
+ If values are > 1.0, they will be normalized to [0, 1].
+
+ Returns:
+ torch.LongTensor: Encoded discrete codes of shape [B, F', H', W'] where:
+ - F' ≈ F // t_downsample (temporal dimension after compression, may vary slightly due to padding)
+ - H' = H // h_downsample (height after compression)
+ - W' = W // w_downsample (width after compression)
+
+ Note:
+ The actual output shape may differ slightly from the theoretical compression
+ due to model-specific padding or overlap behavior.
+ """
+ # Normalize video to [0, 1] if necessary
+ if video.max() > 1.0:
+ video = video / 255.0
+
+ # Ensure video is on correct device and convert to model dtype (typically float32)
+ # TorchScript models often require float32 inputs
+ video = video.to(self.device).to(self.model_dtype)
+
+ # Encode the video
+ with torch.no_grad():
+ # The encoder typically returns (indices, ...) or just indices
+ # Adjust based on actual Cosmos encoder API
+ if hasattr(self.encoder, 'encode'):
+ result = self.encoder.encode(video)
+ if isinstance(result, tuple):
+ indices = result[0]
+ else:
+ indices = result
+ else:
+ # Direct call if encoder is a callable model
+ result = self.encoder(video)
+ if isinstance(result, tuple):
+ indices = result[0]
+ else:
+ indices = result
+
+ # Ensure indices are int64 (an isinstance check against torch.LongTensor is False for CUDA tensors)
+ if indices.dtype != torch.long:
+ indices = indices.long()
+
+ return indices
+
+ def decode(self, codes: torch.LongTensor) -> torch.Tensor:
+ """
+ Decode discrete code indices back into a video tensor.
+
+ Args:
+ codes (torch.LongTensor): Encoded discrete codes of shape [B, F', H', W'].
+
+ Returns:
+ torch.Tensor: Reconstructed video tensor of shape [B, C, F, H, W] where:
+ - F ≈ F' * t_downsample (may vary slightly due to model-specific behavior)
+ - H = H' * h_downsample
+ - W = W' * w_downsample
+ Values are in [0, 1] range.
+
+ Note:
+ The output frame count may differ slightly from the original input due to
+ model-specific temporal interpolation or padding behavior.
+ """
+ # Ensure codes are on correct device
+ codes = codes.to(self.device)
+
+ # Decode the codes
+ with torch.no_grad():
+ if hasattr(self.decoder, 'decode'):
+ reconstructed_video = self.decoder.decode(codes)
+ else:
+ # Direct call if decoder is a callable model
+ reconstructed_video = self.decoder(codes)
+
+ # Ensure output is in [0, 1] range and convert to desired dtype
+ reconstructed_video = torch.clamp(reconstructed_video, 0.0, 1.0)
+
+ # Convert to the tokenizer's dtype if different from model dtype
+ if reconstructed_video.dtype != self.dtype:
+ reconstructed_video = reconstructed_video.to(self.dtype)
+
+ return reconstructed_video
+
+
+EXAMPLE_DOC_STRING = """
+ Examples:
+ ```py
+ >>> video = pipe(prompt).videos[0]
+ ```
+"""
+
+
+class Pipeline(DiffusionPipeline):
+ tokenizer: T5Tokenizer
+ text_encoder: T5EncoderModel
+ transformer: WanDiscreteVideoTransformer
+ scheduler: Scheduler
+ video_tokenizer: CosmosVideoTokenizer
+
+ model_cpu_offload_seq = "text_encoder->transformer->video_tokenizer"
+
+ def __init__(
+ self,
+ tokenizer: T5Tokenizer,
+ text_encoder: T5EncoderModel,
+ transformer: WanDiscreteVideoTransformer,
+ scheduler: Scheduler,
+ video_tokenizer: CosmosVideoTokenizer,
+ text_len: int = 512,
+ num_frames: int = 16,
+ height: int = 480,
+ width: int = 848,
+ ):
+ """
+ Initialize the video diffusion pipeline.
+
+ Args:
+ tokenizer (T5Tokenizer): Wan-style T5 tokenizer (UMT5) for text encoding.
+ text_encoder (T5EncoderModel): Wan-style T5 encoder (UMT5-base outputs 768, UMT5-XXL outputs 4096).
+ transformer (WanDiscreteVideoTransformer): The discrete video transformer model
+ that handles token embedding and logits prediction. Supports dynamic input dimensions.
+ scheduler (Scheduler): The diffusion scheduler.
+ video_tokenizer (CosmosVideoTokenizer): Cosmos DV tokenizer
+ for video encoding/decoding. Required for video generation.
+ text_len (int): Maximum text sequence length (default: 512).
+ num_frames (int): Default number of frames in the video (default: 16).
+ Can be overridden in __call__. Should be divisible by the tokenizer's t_downsample (currently 4).
+ height (int): Default height of the video in pixels (default: 480).
+ Can be overridden in __call__. Should be divisible by the tokenizer's h_downsample (currently 8).
+ width (int): Default width of the video in pixels (default: 848).
+ Can be overridden in __call__. Should be divisible by the tokenizer's w_downsample (currently 8).
+
+ Note:
+ The transformer now supports dynamic input dimensions, so users can generate videos
+ with different frame counts and resolutions by specifying them in __call__().
+ """
+ super().__init__()
+
+ self.register_modules(
+ tokenizer=tokenizer,
+ text_encoder=text_encoder,
+ transformer=transformer,
+ scheduler=scheduler,
+ video_tokenizer=video_tokenizer,
+ )
+ self.text_len = text_len
+ # Store default video dimensions (can be overridden in __call__)
+ self.num_frames = num_frames
+ self.height = height
+ self.width = width
+
+ # Get codebook size from video tokenizer
+ self.codebook_size = video_tokenizer.codebook_size
+
+ # IMPORTANT: Index mapping semantics for discrete diffusion
+ # ============================================================
+ # Cosmos tokenizer outputs indices in [0, codebook_size-1] (all valid codes, no mask)
+ # We extend the vocab by adding mask_token_id = codebook_size
+ #
+ # Model vocab: [0, codebook_size] where:
+ # - [0, codebook_size-1] = actual Cosmos codes (direct mapping, no shift needed)
+ # - codebook_size = mask_token_id (reserved for masking during diffusion)
+ #
+ # This design ensures:
+ # - All Cosmos codes remain valid (no information loss)
+ # - mask_token_id is outside the codebook range, safe for masking
+ # - No need for +1/-1 mapping, Cosmos codes map directly to [0, codebook_size-1]
+ #
+ # When decoding back to Cosmos:
+ # - Model outputs in [0, codebook_size] (may contain codebook_size=mask)
+ # - Clamp to [0, codebook_size-1]: any leftover mask token becomes code codebook_size-1
+ # - This ensures Cosmos only sees valid codes [0, codebook_size-1]
+ # ============================================================
+
+ # Set mask_token_id to codebook_size (outside valid code range)
+ # This will be used by scheduler
+ self.mask_token_id = self.codebook_size
+
+ # Calculate default compressed dimensions
+ # These are used as defaults in __call__ and can be overridden
+ self.F_prime = num_frames // video_tokenizer.t_downsample
+ self.H_prime = height // video_tokenizer.h_downsample
+ self.W_prime = width // video_tokenizer.w_downsample
+
+ def _encode_prompt_wan(
+ self,
+ prompt: Union[str, List[str]],
+ negative_prompt: Optional[Union[str, List[str]]] = None,
+ num_images_per_prompt: int = 1,
+ do_classifier_free_guidance: bool = True,
+ ) -> Tuple[torch.Tensor, Optional[torch.Tensor]]:
+ """
+ Encode prompt(s) using Wan's T5 text encoder.
+
+ Args:
+ prompt (Union[str, List[str]]): The prompt or prompts to encode.
+ negative_prompt (Optional[Union[str, List[str]]]): The negative prompt(s) for CFG.
+ num_images_per_prompt (int): Number of videos to generate per prompt.
+ do_classifier_free_guidance (bool): Whether to use classifier-free guidance.
+
+ Returns:
+ Tuple[torch.Tensor, Optional[torch.Tensor]]:
+ - encoder_hidden_states: [B_total, L_text, D_text] where B_total = B * num_images_per_prompt
+ - encoder_hidden_states_neg: [B_total, L_text, D_text] for CFG, or None if not using CFG
+ """
+ if isinstance(prompt, str):
+ prompt = [prompt]
+
+ # Tokenize prompts
+ input_ids = self.tokenizer(
+ prompt,
+ padding="max_length",
+ truncation=True,
+ max_length=self.text_len,
+ return_tensors="pt"
+ )["input_ids"].to(self._execution_device)
+
+ # Encode prompts
+ with torch.no_grad():
+ outputs = self.text_encoder(input_ids, return_dict=True)
+ encoder_hidden_states = outputs.last_hidden_state # [B, L_text, D_text]
+
+ # Repeat for num_images_per_prompt
+ encoder_hidden_states = encoder_hidden_states.repeat(num_images_per_prompt, 1, 1)
+
+ # Handle negative prompts for CFG
+ encoder_hidden_states_neg = None
+ if do_classifier_free_guidance:
+ if negative_prompt is None:
+ negative_prompt = [""] * len(prompt)
+
+ if isinstance(negative_prompt, str):
+ negative_prompt = [negative_prompt] * len(prompt)  # broadcast one negative prompt across the batch
+
+ # Tokenize negative prompts
+ negative_input_ids = self.tokenizer(
+ negative_prompt,
+ padding="max_length",
+ truncation=True,
+ max_length=self.text_len,
+ return_tensors="pt"
+ )["input_ids"].to(self._execution_device)
+
+ # Encode negative prompts
+ with torch.no_grad():
+ negative_outputs = self.text_encoder(negative_input_ids, return_dict=True)
+ encoder_hidden_states_neg = negative_outputs.last_hidden_state # [B, L_text, D_text]
+
+ # Repeat for num_images_per_prompt
+ encoder_hidden_states_neg = encoder_hidden_states_neg.repeat(num_images_per_prompt, 1, 1)
+
+ # Assertions for shape verification
+ B_total = len(prompt) * num_images_per_prompt
+ assert encoder_hidden_states.shape == (B_total, self.text_len, encoder_hidden_states.shape[-1]), (
+ f"Expected encoder_hidden_states shape ({B_total}, {self.text_len}, D_text), "
+ f"got {encoder_hidden_states.shape}"
+ )
+ if encoder_hidden_states_neg is not None:
+ assert encoder_hidden_states_neg.shape == (B_total, self.text_len, encoder_hidden_states_neg.shape[-1]), (
+ f"Expected encoder_hidden_states_neg shape ({B_total}, {self.text_len}, D_text), "
+ f"got {encoder_hidden_states_neg.shape}"
+ )
+
+ return encoder_hidden_states, encoder_hidden_states_neg
+
+ @torch.no_grad()
+ @replace_example_docstring(EXAMPLE_DOC_STRING)
+ def __call__(
+ self,
+ prompt: Optional[Union[List[str], str]] = None,
+ num_frames: Optional[int] = None,
+ height: Optional[int] = None,
+ width: Optional[int] = None,
+ num_inference_steps: int = 48,
+ guidance_scale: float = 9.0,
+ negative_prompt: Optional[Union[str, List[str]]] = None,
+ num_images_per_prompt: Optional[int] = 1,
+ generator: Optional[torch.Generator] = None,
+ latents: Optional[torch.LongTensor] = None,
+ encoder_hidden_states: Optional[torch.Tensor] = None,
+ negative_encoder_hidden_states: Optional[torch.Tensor] = None,
+ output_type="pil",
+ return_dict: bool = True,
+ callback: Optional[Callable[[int, int, torch.Tensor], None]] = None,
+ callback_steps: int = 1,
+ cross_attention_kwargs: Optional[Dict[str, Any]] = None,
+ micro_conditioning_aesthetic_score: int = 6,
+ micro_conditioning_crop_coord: Tuple[int, int] = (0, 0),
+ temperature: Union[int, Tuple[int, int], List[int]] = (2, 0),
+ ):
+ """
+ The call function to the pipeline for generation.
+
+ Args:
+ prompt (`str` or `List[str]`, *optional*):
+ The prompt or prompts to guide video generation. If not defined, you need to pass `encoder_hidden_states`.
+ num_frames (`int`, *optional*, defaults to `self.num_frames`):
+ Number of frames in the generated video.
+ height (`int`, *optional*, defaults to `self.height`):
+ The height in pixels of the generated video.
+ width (`int`, *optional*, defaults to `self.width`):
+ The width in pixels of the generated video.
+ num_inference_steps (`int`, *optional*, defaults to 48):
+ The number of denoising steps. More denoising steps usually lead to a higher quality video at the
+ expense of slower inference.
+ guidance_scale (`float`, *optional*, defaults to 9.0):
+ A higher guidance scale value encourages the model to generate videos closely linked to the text
+ `prompt` at the expense of lower video quality. Guidance scale is enabled when `guidance_scale > 1`.
+ negative_prompt (`str` or `List[str]`, *optional*):
+ The prompt or prompts to guide what to not include in video generation. If not defined, an empty
+ prompt is used. Ignored when not using guidance (`guidance_scale <= 1`).
+ num_images_per_prompt (`int`, *optional*, defaults to 1):
+ The number of videos to generate per prompt.
+ generator (`torch.Generator`, *optional*):
+ A [`torch.Generator`](https://pytorch.org/docs/stable/generated/torch.Generator.html) to make
+ generation deterministic.
+ latents (`torch.LongTensor`, *optional*):
+ Pre-generated 3D video codes of shape `[B, F', H', W']` where F', H', W' are the compressed
+ dimensions after Cosmos tokenization. If not provided, the starting codes will be completely
+ masked (filled with mask_token_id).
+ encoder_hidden_states (`torch.Tensor`, *optional*):
+ Pre-generated encoder hidden states from the T5 text encoder. If not provided, will be generated
+ from the `prompt` input argument using Wan's T5 encoder.
+ negative_encoder_hidden_states (`torch.Tensor`, *optional*):
+ Pre-generated negative encoder hidden states for classifier-free guidance. If not provided, will be
+ generated from the `negative_prompt` input argument.
+ output_type (`str`, *optional*, defaults to `"pil"`):
+ The output format of the generated video. Choose between `"pil"`, `"np"`, `"pt"`, and `"latent"`.
+ return_dict (`bool`, *optional*, defaults to `True`):
+ Whether or not to return a [`VideoPipelineOutput`] instead of a
+ plain tuple.
+ callback (`Callable`, *optional*):
+ A function that is called every `callback_steps` steps during inference. The function is called with the
+ following arguments: `callback(step: int, timestep: int, latents: torch.Tensor)`.
+ callback_steps (`int`, *optional*, defaults to 1):
+ The frequency at which the `callback` function is called. If not specified, the callback is called at
+ every step.
+ cross_attention_kwargs (`dict`, *optional*):
+ A kwargs dictionary that if specified is passed along to the [`AttentionProcessor`] as defined in
+ [`self.processor`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py).
+ micro_conditioning_aesthetic_score (`int`, *optional*, defaults to 6):
+ The targeted aesthetic score according to the laion aesthetic classifier. See
+ https://laion.ai/blog/laion-aesthetics/ and the micro-conditioning section of
+ https://arxiv.org/abs/2307.01952.
+ micro_conditioning_crop_coord (`Tuple[int]`, *optional*, defaults to (0, 0)):
+ The targeted height, width crop coordinates. See the micro-conditioning section of
+ https://arxiv.org/abs/2307.01952.
+ temperature (`Union[int, Tuple[int, int], List[int]]`, *optional*, defaults to (2, 0)):
+ Configures the temperature scheduler on `self.scheduler` see `Scheduler#set_timesteps`.
+
+ Examples:
+
+ Returns:
+ [`VideoPipelineOutput`] or `tuple`:
+ If `return_dict` is `True`, [`VideoPipelineOutput`] is returned, otherwise a
+ `tuple` is returned where the first element contains the generated videos.
+ The output format depends on `output_type`:
+ - `"latent"`: Discrete code tensor of shape `[B, F', H', W']`
+ - `"pt"`: Float tensor of shape `[B, C, F, H, W]` with values in [0, 1]
+ - `"np"`: Numpy array of shape `[B, C, F, H, W]`
+ - `"pil"`: List of lists of PIL Images (one list per batch, each containing frames)
+ """
+ # Validate inputs
+ if prompt is None and encoder_hidden_states is None:
+ raise ValueError("Either `prompt` or `encoder_hidden_states` must be provided")
+
+ if prompt is not None and encoder_hidden_states is not None:
+ raise ValueError("Cannot pass both `prompt` and `encoder_hidden_states`")
+
+ # Determine batch size
+ if prompt is not None:
+ if isinstance(prompt, str):
+ prompt = [prompt]
+ batch_size = len(prompt)
+ else:
+ batch_size = encoder_hidden_states.shape[0]
+
+ batch_size = batch_size * num_images_per_prompt
+
+ # Use provided dimensions or fall back to defaults
+ if num_frames is None:
+ num_frames = self.num_frames
+ if height is None:
+ height = self.height
+ if width is None:
+ width = self.width
+
+ # Validate dimensions are divisible by tokenizer's downsampling factors
+ t_ds = self.video_tokenizer.t_downsample
+ h_ds = self.video_tokenizer.h_downsample
+ w_ds = self.video_tokenizer.w_downsample
+
+ # if num_frames % t_ds != 0:
+ # raise ValueError(
+ # f"num_frames ({num_frames}) must be divisible by temporal downsampling factor ({t_ds})"
+ # )
+ # if height % h_ds != 0:
+ # raise ValueError(
+ # f"height ({height}) must be divisible by height downsampling factor ({h_ds})"
+ # )
+ # if width % w_ds != 0:
+ # raise ValueError(
+ # f"width ({width}) must be divisible by width downsampling factor ({w_ds})"
+ # )
+
+ # Calculate compressed dimensions for this generation
+ F_prime = num_frames // t_ds
+ H_prime = height // h_ds
+ W_prime = width // w_ds
+
+ # Encode prompts using Wan's T5 encoder
+ do_classifier_free_guidance = guidance_scale > 1.0
+
+ if encoder_hidden_states is None:
+ encoder_hidden_states, encoder_hidden_states_neg = self._encode_prompt_wan(
+ prompt=prompt,
+ negative_prompt=negative_prompt,
+ num_images_per_prompt=num_images_per_prompt,
+ do_classifier_free_guidance=do_classifier_free_guidance,
+ )
+ else:
+ # Use pre-computed encoder_hidden_states
+ encoder_hidden_states = encoder_hidden_states.repeat(num_images_per_prompt, 1, 1)
+ if do_classifier_free_guidance:
+ if negative_encoder_hidden_states is None:
+ raise ValueError("`negative_encoder_hidden_states` must be provided when using guidance_scale > 1.0")
+ encoder_hidden_states_neg = negative_encoder_hidden_states.repeat(num_images_per_prompt, 1, 1)
+ else:
+ encoder_hidden_states_neg = None
+
+ # Stack negative and positive for classifier-free guidance
+ if do_classifier_free_guidance:
+ # Stack [negative, positive] along batch dimension
+ encoder_hidden_states = torch.cat([encoder_hidden_states_neg, encoder_hidden_states], dim=0)
+ # Verify shape: should be (2 * batch_size, text_len, text_dim)
+ assert encoder_hidden_states.shape[0] == 2 * batch_size, (
+ f"Expected batch size {2 * batch_size} after CFG stacking, got {encoder_hidden_states.shape[0]}"
+ )
+
+ # Initialize 3D video codes: [B, F', H', W']
+ # Note: latents_codes use extended vocab [0, codebook_size] where:
+ # - [0, codebook_size-1] = valid Cosmos codes (direct mapping)
+ # - codebook_size = mask_token_id
+ # If provided latents are from Cosmos (range [0, codebook_size-1]), they're already correct
+ if latents is None:
+ # Start with all mask tokens (codebook_size) in extended vocab space
+ latents_codes = torch.full(
+ (batch_size, F_prime, H_prime, W_prime),
+ self.mask_token_id, # codebook_size = mask_token_id
+ dtype=torch.long,
+ device=self._execution_device
+ )
+ else:
+ # If latents are provided, assume they are already in extended vocab format [0, codebook_size]
+ # Cosmos codes [0, codebook_size-1] map directly, no shift needed
+ latents_codes = latents
+ assert latents_codes.shape[1:] == (F_prime, H_prime, W_prime), (
+ f"Expected latents shape [B, {F_prime}, {H_prime}, {W_prime}], "
+ f"got {latents_codes.shape}"
+ )
+ # Verify values are in extended vocab range [0, codebook_size]
+ assert latents_codes.min() >= 0 and latents_codes.max() <= self.codebook_size, (
+ f"Latents values should be in [0, {self.codebook_size}] (extended vocab), "
+ f"got range [{latents_codes.min()}, {latents_codes.max()}]"
+ )
+
+ # Print initial latents shape for debugging
+ if DEBUG_PIPELINE:
+ print(f"Initial latents_codes shape: {latents_codes.shape} [B, F', H', W']")
+
+ self.scheduler.set_timesteps(num_inference_steps, temperature, self._execution_device)
+
+ num_warmup_steps = len(self.scheduler.timesteps) - num_inference_steps * self.scheduler.order
+ with self.progress_bar(total=num_inference_steps) as progress_bar:
+ for i, timestep in enumerate(self.scheduler.timesteps):
+ # Handle classifier-free guidance: duplicate codes if needed
+ # IMPORTANT: Always build latents_codes_input from latents_codes (not from previous iteration's latents_codes_input)
+ if guidance_scale > 1.0:
+ latents_codes_input = torch.cat([latents_codes] * 2, dim=0)
+ batch_size_total = 2 * batch_size
+ if DEBUG_PIPELINE:
+ print(f"[DEBUG] CFG: latents_codes.shape={latents_codes.shape}, latents_codes_input.shape={latents_codes_input.shape}, encoder_hidden_states.shape={encoder_hidden_states.shape}")
+ else:
+ latents_codes_input = latents_codes
+ batch_size_total = batch_size
+ if DEBUG_PIPELINE:
+ print(f"[DEBUG] latents_codes.shape={latents_codes.shape}, latents_codes_input.shape={latents_codes_input.shape}, encoder_hidden_states.shape={encoder_hidden_states.shape}")
+
+ # Verify shapes before transformer call
+ assert latents_codes_input.shape[0] == batch_size_total, (
+ f"latents_codes_input batch mismatch: {latents_codes_input.shape[0]} != {batch_size_total}"
+ )
+ assert encoder_hidden_states.shape[0] == batch_size_total, (
+ f"encoder_hidden_states batch mismatch: {encoder_hidden_states.shape[0]} != {batch_size_total}"
+ )
+
+ # Prepare timestep tensor: [B_total]
+ timestep_tensor = torch.full(
+ (batch_size_total,),
+ timestep,
+ dtype=torch.long,
+ device=self._execution_device
+ )
+
+ # Call transformer
+ if DEBUG_PIPELINE:
+ print(f"[DEBUG] Before transformer: tokens.shape={latents_codes_input.shape}, timesteps.shape={timestep_tensor.shape}, encoder_hidden_states.shape={encoder_hidden_states.shape}")
+ logits = self.transformer(
+ tokens=latents_codes_input,
+ timesteps=timestep_tensor,
+ encoder_hidden_states=encoder_hidden_states,
+ y=None,
+ )
+ if DEBUG_PIPELINE:
+ print(f"[DEBUG] After transformer: logits.shape={logits.shape}, expected batch={batch_size_total}")
+
+ # Verify logits shape matches expected token count
+ # logits: [B_total, vocab_size, F_out, H_out, W_out] where vocab_size = codebook_size + 1
+ # latents_codes: [B_total, F', H', W'] with values in [0, vocab_size-1]
+ vocab_size = self.codebook_size + 1
+ assert logits.shape[0] == batch_size_total, (
+ f"Logits batch size mismatch: {logits.shape[0]} != {batch_size_total}"
+ )
+ assert logits.shape[1] == vocab_size, (
+ f"Logits vocab size mismatch: {logits.shape[1]} != {vocab_size} (expected codebook_size+1)"
+ )
+
+ # Apply classifier-free guidance if needed
+ # logits shape: [B_total, vocab_size, F_out, H_out, W_out]
+ if guidance_scale > 1.0:
+ uncond_logits, cond_logits = logits.chunk(2, dim=0)
+ logits = uncond_logits + guidance_scale * (cond_logits - uncond_logits)
+ # After CFG, logits batch becomes batch_size (not 2*batch_size)
+ # We'll use latents_codes (not latents_codes_input) for flattening, so no need to update it here
+
+ # Flatten video tokens for scheduler: [B, F, H, W] -> [B, N] where N = F*H*W
+ # Scheduler expects 1D token sequences, so we flatten the spatial-temporal dimensions
+ # Use logits.shape[0] as the batch size (after CFG, this is batch_size, not batch_size_total)
+ B_flat, vocab_size, F_flat, H_flat, W_flat = logits.shape
+ N = F_flat * H_flat * W_flat
+
+ # After CFG, logits batch is batch_size (not 2*batch_size)
+ # Use latents_codes (not latents_codes_input) for flattening, since latents_codes is always [B, F, H, W]
+ if DEBUG_PIPELINE:
+ print(f"[DEBUG] After CFG: logits.shape={logits.shape}, latents_codes.shape={latents_codes.shape}")
+
+ # Handle shape mismatch: transformer output may have different spatial dimensions due to patch_size
+ # Crop or pad latents_codes to match logits dimensions
+ if latents_codes.shape[1:] != (F_flat, H_flat, W_flat):
+ if DEBUG_PIPELINE:
+ print(f"[DEBUG] Shape mismatch detected: latents_codes {latents_codes.shape[1:]} != logits {logits.shape[2:]}, adjusting...")
+ B_lat, F_lat, H_lat, W_lat = latents_codes.shape
+
+ # Create a new tensor with the correct shape, filled with mask_token_id
+ old_shape = latents_codes.shape
+ new_latents_codes = torch.full(
+ (B_flat, F_flat, H_flat, W_flat),
+ self.mask_token_id,
+ dtype=latents_codes.dtype,
+ device=latents_codes.device
+ )
+
+ # Copy overlapping region from latents_codes
+ F_copy = min(F_lat, F_flat)
+ H_copy = min(H_lat, H_flat)
+ W_copy = min(W_lat, W_flat)
+ new_latents_codes[:, :F_copy, :H_copy, :W_copy] = latents_codes[:, :F_copy, :H_copy, :W_copy]
+
+ latents_codes = new_latents_codes
+ if DEBUG_PIPELINE:
+ print(f"[DEBUG] Adjusted latents_codes from {old_shape} to {latents_codes.shape} (copied {F_copy}x{H_copy}x{W_copy} region)")
+
+ # Verify shapes match after adjustment
+ assert latents_codes.shape == (B_flat, F_flat, H_flat, W_flat), (
+ f"Shape mismatch after adjustment: logits [B, vocab, F, H, W]={logits.shape}, "
+ f"latents_codes [B, F, H, W]={latents_codes.shape}"
+ )
+ assert logits.numel() == B_flat * vocab_size * N, (
+ f"logits.numel: {logits.numel()}, reshape target: {B_flat*vocab_size*N}"
+ )
+
+ tokens_flat = latents_codes.view(B_flat, N) # [B, N]
+ logits_flat = logits.permute(0, 2, 3, 4, 1).reshape(B_flat, N, vocab_size) # [B, N, vocab_size]
+ assert (tokens_flat >= 0).all() and (tokens_flat < vocab_size).all(), (
+ f"[DEBUG] Out-of-range token: min={tokens_flat.min().item()}, max={tokens_flat.max().item()}, vocab_size={vocab_size}"
+ )
+
+ # Scheduler step: update discrete codes based on logits
+ # Scheduler works on 1D token sequences [B, N] with logits [B, N, vocab] (B is the post-CFG batch size)
+ scheduler_output = self.scheduler.step(
+ model_output=logits_flat, # [B, N, vocab]
+ timestep=timestep,
+ sample=tokens_flat, # [B, N]
+ generator=generator,
+ )
+
+ # Unflatten back to video grid: [B, N] -> [B, F, H, W]
+ # Note: After CFG, B_flat = batch_size (not 2*batch_size), so the reshaped codes are already [B, F, H, W]
+ tokens_flat = scheduler_output.prev_sample # [B, N]
+ latents_codes = tokens_flat.view(B_flat, F_flat, H_flat, W_flat) # [B, F, H, W]
+
+ # No need to slice after CFG, because B_flat is already batch_size (not 2*batch_size)
+ # The CFG merging was done on logits before flattening, so latents_codes is already the correct size
+
+ if i == len(self.scheduler.timesteps) - 1 or (
+ (i + 1) > num_warmup_steps and (i + 1) % self.scheduler.order == 0
+ ):
+ progress_bar.update()
+ if callback is not None and i % callback_steps == 0:
+ step_idx = i // getattr(self.scheduler, "order", 1)
+ callback(step_idx, timestep, latents_codes)
+
+ # Final codes after denoising: [B, F', H', W']
+ final_codes = latents_codes
+
+ self.maybe_free_model_hooks()
+
+ # Handle output based on output_type
+ if output_type == "latent":
+ # Return discrete code tensor [B, F', H', W']
+ output = final_codes
+ # Verify output dtype and shape
+ assert output.dtype in (torch.long, torch.int64), \
+ f"Expected latent output dtype torch.long or torch.int64, got {output.dtype}"
+ assert output.shape == (batch_size, F_prime, H_prime, W_prime), (
+ f"Expected latent output shape {(batch_size, F_prime, H_prime, W_prime)}, "
+ f"got {output.shape}"
+ )
+ else:
+ # Decode codes back to RGB video using Cosmos tokenizer
+ # IMPORTANT: Model uses extended vocab [0, codebook_size] where:
+ # - [0, codebook_size-1] = valid Cosmos codes (direct mapping)
+ # - codebook_size = mask_token_id
+ # Cosmos expects [0, codebook_size-1], so we filter out mask tokens
+ with torch.no_grad():
+ # Map from model vocab [0, codebook_size] to Cosmos vocab [0, codebook_size-1]
+ # Clamp to [0, codebook_size-1] to filter out mask_token_id (codebook_size)
+ cosmos_codes = torch.clamp(final_codes, min=0, max=self.codebook_size - 1)
+ videos = self.video_tokenizer.decode(cosmos_codes)
+
+ # Postprocess to standard video output format
+ # videos is in [0, 1] range as float tensor
+ if output_type == "np":
+ # Convert to numpy array (convert to float32 first, numpy doesn't support bfloat16)
+ videos_cpu = videos.cpu()
+ if videos_cpu.dtype != torch.float32:
+ videos_cpu = videos_cpu.to(torch.float32)
+ output = videos_cpu.numpy()
+ elif output_type == "pil":
+ # Convert to list of PIL Images (one list per batch, each containing frames)
+ import numpy as np
+ from PIL import Image
+ output = []
+ videos_cpu = videos.cpu()
+ if videos_cpu.dtype != torch.float32:
+ videos_cpu = videos_cpu.to(torch.float32)
+ for b in range(batch_size):
+ video_frames = []
+ for f in range(videos_cpu.shape[2]): # Loop over frames
+ frame = videos_cpu[b, :, f, :, :].numpy() # [C, H, W]
+ frame = np.transpose(frame, (1, 2, 0)) # [H, W, C]
+ frame = (np.clip(frame, 0.0, 1.0) * 255).astype(np.uint8)
+ video_frames.append(Image.fromarray(frame))
+ output.append(video_frames)
+ else:
+ # output_type == "pt", keep as float tensor in [0, 1]
+ output = videos
+
+ if not return_dict:
+ return (output,)
+
+ return VideoPipelineOutput(videos=output)
+
+
+def test_cosmos_tokenizer_shapes():
+ """
+ Independent test for CosmosVideoTokenizer encode/decode shape verification.
+
+ Tests:
+ - Encode video tensor [B, C, F, H, W] -> codes [B, F', H', W']
+ - Decode codes [B, F', H', W'] -> video tensor [B, C, F, H, W]
+ - Verify shape consistency and compression factors
+ """
+ import torch
+
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ dtype = torch.float32
+
+ print("=" * 80)
+ print("[Test] CosmosVideoTokenizer encode/decode shape test")
+ print("=" * 80)
+
+ try:
+ model_id = "Cosmos-1.0-Tokenizer-DV8x16x16"
+ print(f"Loading tokenizer: {model_id}")
+
+ video_tokenizer = CosmosVideoTokenizer(model_id=model_id, device=device, dtype=dtype)
+ print(f"✓ Tokenizer loaded")
+ print(f" Codebook size: {video_tokenizer.codebook_size}")
+ print(f" Mask token ID: {video_tokenizer.mask_token_id}")
+ print(f" Compression: {video_tokenizer.t_downsample}x{video_tokenizer.h_downsample}x{video_tokenizer.w_downsample}")
+
+ # Test encode/decode with video shape that aligns well with compression factors
+ # Use frame count that's a multiple of t_downsample to minimize rounding issues
+ # t_downsample=8, so use 16 frames (16/8=2 compressed frames)
+ # h_downsample=16, w_downsample=16, so use dimensions divisible by 16
+ B, C, F, H, W = 1, 3, 16, 480, 848
+ test_video = torch.rand(B, C, F, H, W, device=device, dtype=dtype)
+ print(f"\nInput video shape: {test_video.shape} [B, C, F, H, W]")
+
+ # Encode
+ codes = video_tokenizer.encode(test_video)
+ assert codes.ndim == 4, f"Expected codes to be 4D [B, F', H', W'], got {codes.ndim}D"
+ assert codes.shape[0] == B, f"Batch size mismatch: {codes.shape[0]} != {B}"
+
+ F_prime = codes.shape[1]
+ H_prime = codes.shape[2]
+ W_prime = codes.shape[3]
+ print(f"Encoded codes shape: {codes.shape} [B, F', H', W']")
+
+ # Verify compression factors (allow small rounding errors for temporal dimension)
+ expected_F_prime = F // video_tokenizer.t_downsample
+ assert abs(F_prime - expected_F_prime) <= 1, \
+ f"Frame compression mismatch: {F_prime} vs expected ~{expected_F_prime} (from {F} // {video_tokenizer.t_downsample})"
+
+ expected_H_prime = H // video_tokenizer.h_downsample
+ assert H_prime == expected_H_prime, \
+ f"Height compression mismatch: {H_prime} vs {expected_H_prime} (from {H} // {video_tokenizer.h_downsample})"
+
+ expected_W_prime = W // video_tokenizer.w_downsample
+ assert W_prime == expected_W_prime, \
+ f"Width compression mismatch: {W_prime} vs {expected_W_prime} (from {W} // {video_tokenizer.w_downsample})"
+
+ # Decode
+ decoded = video_tokenizer.decode(codes)
+ decoded_B, decoded_C, decoded_F, decoded_H, decoded_W = decoded.shape
+
+ # Verify decoded shape (allow small rounding errors for temporal dimension)
+ assert decoded_B == B, f"Decoded batch size mismatch: {decoded_B} != {B}"
+ assert decoded_C == C, f"Decoded channel mismatch: {decoded_C} != {C}"
+ assert decoded_H == H, f"Decoded height mismatch: {decoded_H} != {H}"
+ assert decoded_W == W, f"Decoded width mismatch: {decoded_W} != {W}"
+
+ # Frame count may differ slightly due to tokenizer's temporal interpolation/padding
+ # Allow ±1 frame tolerance
+ assert abs(decoded_F - F) <= 1, \
+ f"Decoded frame count mismatch: {decoded_F} vs {F} (allowed ±1 frame tolerance)"
+
+ print(f"Decoded video shape: {decoded.shape} [B, C, F, H, W]")
+ if decoded_F != F:
+ print(f" Note: Frame count differs by {decoded_F - F} (expected {F}, got {decoded_F})")
+ print(f" This is acceptable due to tokenizer's temporal interpolation behavior")
+
+ print(f"\n✓ All shape checks passed!")
+ print(f" Compression: {F}x{H}x{W} -> {F_prime}x{H_prime}x{W_prime} -> {decoded_F}x{H}x{W}")
+ print(f" Compression ratios: temporal={F_prime/F:.3f}, spatial={H_prime*W_prime/(H*W):.3f}")
+ return True
+
+ except Exception as e:
+ print(f"\n✗ CosmosVideoTokenizer shape test failed: {e}")
+ import traceback
+ traceback.print_exc()
+ return False
+
+
+def test_pipeline_forward_latent_only(pipe, device):
+ """
+ Test pipeline forward pass with latent-only output (no decoding).
+
+ This test verifies:
+ - Pipeline initialization and forward pass
+ - Shape consistency through the denoising loop
+ - Latent output format [B, F', H', W']
+ - Token value ranges [0, codebook_size]
+ """
+ print("\n" + "=" * 80)
+ print("[Test] Pipeline forward pass (latent-only output)")
+ print("=" * 80)
+
+ try:
+ prompt = ["a test prompt"]
+ num_frames = 8
+ height = 256
+ width = 448
+ num_inference_steps = 2
+
+ print(f"Test parameters:")
+ print(f" prompt: {prompt}")
+ print(f" num_frames: {num_frames}")
+ print(f" height: {height}, width: {width}")
+ print(f" num_inference_steps: {num_inference_steps}")
+ print(f" output_type: 'latent'")
+
+ # Run pipeline with latent output
+ result = pipe(
+ prompt=prompt,
+ num_frames=num_frames,
+ height=height,
+ width=width,
+ num_inference_steps=num_inference_steps,
+ output_type="latent",
+ return_dict=True,
+ )
+
+ # Verify output shape and type
+ output = result.videos
+ assert isinstance(output, torch.Tensor), f"Expected torch.Tensor, got {type(output)}"
+ assert output.dtype in (torch.long, torch.int64), \
+ f"Expected dtype torch.long or torch.int64, got {output.dtype}"
+
+ # Calculate expected compressed dimensions
+ F_prime = num_frames // pipe.video_tokenizer.t_downsample
+ H_prime = height // pipe.video_tokenizer.h_downsample
+ W_prime = width // pipe.video_tokenizer.w_downsample
+
+ expected_shape = (1, F_prime, H_prime, W_prime)
+ assert output.shape == expected_shape, \
+ f"Expected output shape {expected_shape}, got {output.shape}"
+
+ print(f"\n✓ Output shape verified: {output.shape} [B, F', H', W']")
+
+ # Check token value ranges
+ min_val = output.min().item()
+ max_val = output.max().item()
+ codebook_size = pipe.video_tokenizer.codebook_size
+
+ print(f"Token value range: [{min_val}, {max_val}]")
+ print(f"Codebook size: {codebook_size}")
+
+ # Tokens should be in [0, codebook_size] (codebook_size is mask_token_id)
+ assert min_val >= 0, f"Token values should be >= 0, got min={min_val}"
+ assert max_val <= codebook_size, \
+ f"Token values should be <= {codebook_size}, got max={max_val}"
+
+ # Print sample tokens
+ sample_tokens = output[0, 0, :5, :5].cpu().numpy()
+ print(f"\nSample tokens (first 5x5 of first frame):")
+ print(sample_tokens)
+
+ print(f"\n✓ All latent-only tests passed!")
+ return True
+
+ except Exception as e:
+ print(f"\n✗ Pipeline forward test failed: {e}")
+ import traceback
+ traceback.print_exc()
+ return False
+
+
+# if __name__ == "__main__":
+# """
+# Comprehensive test for the video diffusion pipeline.
+
+# Test sequence:
+# 1. test_cosmos_tokenizer_shapes() - Verify CosmosVideoTokenizer encode/decode
+# 2. Build all pipeline components
+# 3. test_pipeline_forward_latent_only() - Test pipeline forward pass with latent output
+# 4. Full pipeline test with PIL output (optional, after latent test passes)
+# """
+# import torch
+
+# # Set device and dtype
+# device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+# dtype = torch.float32
+
+# print("=" * 80)
+# print("Testing Video Diffusion Pipeline")
+# print("=" * 80)
+
+# # Step 1: Test CosmosVideoTokenizer shapes
+# if not test_cosmos_tokenizer_shapes():
+# print("\n✗ CosmosVideoTokenizer test failed. Exiting.")
+# exit(1)
+
+# # Step 2: Build pipeline components
+# print("\n" + "=" * 80)
+# print("Building Pipeline Components")
+# print("=" * 80)
+
+# try:
+# model_id = "Cosmos-1.0-Tokenizer-DV8x16x16"
+# print(f"\nLoading video_tokenizer: {model_id}")
+# video_tokenizer = CosmosVideoTokenizer(model_id=model_id, device=device, dtype=dtype)
+# print(f"✓ Video tokenizer loaded")
+
+# print(f"\nLoading T5 tokenizer and encoder...")
+# from transformers import T5Tokenizer, T5EncoderModel
+# tokenizer = T5Tokenizer.from_pretrained('google/umt5-base')
+# text_encoder = T5EncoderModel.from_pretrained('google/umt5-base').to(device, dtype=dtype)
+# print(f"✓ T5 components loaded")
+
+# print(f"\nInitializing Scheduler...")
+# from src.scheduler_video import Scheduler
+# # mask_token_id = codebook_size (outside valid code range [0, codebook_size-1])
+# scheduler = Scheduler(
+# mask_token_id=video_tokenizer.mask_token_id, # = codebook_size
+# masking_schedule="cosine"
+# )
+# print(f"✓ Scheduler initialized")
+# print(f" Scheduler mask_token_id: {scheduler.config.mask_token_id}")
+# print(f" Video tokenizer codebook_size: {video_tokenizer.codebook_size}")
+
+# print(f"\nLoading transformer...")
+# from src.transformer_video import WanDiscreteVideoTransformer
+
+# # Calculate compressed dimensions for transformer
+# num_frames = 8
+# height = 256
+# width = 448
+# F_prime = num_frames // video_tokenizer.t_downsample
+# H_prime = height // video_tokenizer.h_downsample
+# W_prime = width // video_tokenizer.w_downsample
+
+# # Get actual text encoder output dimension
+# # UMT5-base outputs 768 dimensions, not 4096
+# text_dim_actual = text_encoder.config.d_model
+# print(f" Text encoder output dimension: {text_dim_actual}")
+
+# transformer = WanDiscreteVideoTransformer(
+# codebook_size=video_tokenizer.codebook_size,
+# vocab_size=video_tokenizer.codebook_size + 1,
+# num_frames=F_prime,
+# height=H_prime,
+# width=W_prime,
+# model_type='t2v',
+# patch_size=(1, 2, 2),
+# text_len=512,
+# in_dim=16,
+# dim=2048,
+# ffn_dim=8192,
+# freq_dim=256,
+# text_dim=text_dim_actual, # Use actual text encoder dimension (768 for UMT5-base)
+# out_dim=16,
+# num_heads=16,
+# num_layers=32,
+# window_size=(-1, -1),
+# qk_norm=True,
+# cross_attn_norm=True,
+# eps=1e-6
+# ).to(device, dtype=dtype)
+# print(f"✓ Transformer initialized")
+
+# print(f"\nInitializing Pipeline...")
+# pipe = Pipeline(
+# tokenizer=tokenizer,
+# text_encoder=text_encoder,
+# transformer=transformer,
+# scheduler=scheduler,
+# video_tokenizer=video_tokenizer,
+# text_len=512,
+# num_frames=num_frames,
+# height=height,
+# width=width,
+# ).to(device)
+# print(f"✓ Pipeline initialized")
+
+# except Exception as e:
+# print(f"\n✗ Failed to build pipeline components: {e}")
+# import traceback
+# traceback.print_exc()
+# exit(1)
+
+# # Step 3: Test pipeline forward pass with latent-only output
+# if not test_pipeline_forward_latent_only(pipe, device):
+# print("\n✗ Pipeline forward test failed. Exiting.")
+# exit(1)
+
+# # Step 4: Optional - Full pipeline test with PIL output
+# # (Uncomment when latent-only test passes)
+# # print("\n" + "=" * 80)
+# # print("[Test] Full pipeline test with PIL output")
+# # print("=" * 80)
+# # try:
+# # result = pipe(
+# # prompt="a test video",
+# # num_frames=num_frames,
+# # height=height,
+# # width=width,
+# # num_inference_steps=2,
+# # output_type="pil",
+# # return_dict=True,
+# # )
+# # print(f"✓ Full pipeline test passed!")
+# # except Exception as e:
+# # print(f"✗ Full pipeline test failed: {e}")
+# # import traceback
+# # traceback.print_exc()
+
+# print("\n" + "=" * 80)
+# print("All tests passed successfully!")
+# print("=" * 80)
\ No newline at end of file
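Review note on the shape checks above: they rest on two pieces of arithmetic, the compressed token grid `[F', H', W']` and the row-major flattening to `N = F' * H' * W'` tokens. The following is a minimal stdlib-only sketch of that arithmetic, not code from the pipeline. `latent_grid` and `flat_index` are hypothetical helper names, and plain floor division is an assumption taken from the tests above (the tokenizer may round the temporal axis differently, which is why those tests allow a tolerance of one frame).

```python
from typing import Tuple


def latent_grid(num_frames: int, height: int, width: int,
                t_down: int = 8, h_down: int = 16, w_down: int = 16) -> Tuple[int, int, int]:
    """Expected token grid (F', H', W') under plain floor division (an assumption)."""
    return num_frames // t_down, height // h_down, width // w_down


def flat_index(f: int, h: int, w: int, grid: Tuple[int, int, int]) -> int:
    """Row-major position of token (f, h, w) in the flattened sequence of N tokens."""
    _, grid_h, grid_w = grid
    return (f * grid_h + h) * grid_w + w


grid = latent_grid(8, 256, 448)              # settings used by the latent-only test
num_tokens = grid[0] * grid[1] * grid[2]
print(grid, num_tokens)
print(flat_index(0, 1, 0, grid))             # first token of the second row
```

Both `latents_codes.view(B, N)` and `logits.permute(0, 2, 3, 4, 1).reshape(B, N, vocab_size)` enumerate positions in this same row-major order, so token `n` and logits row `n` refer to the same `(f, h, w)` cell.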
diff --git a/Meissonic/src/scheduler.py b/Meissonic/src/scheduler.py
new file mode 100644
index 0000000000000000000000000000000000000000..3d2fe4276351ffb0ec5883c59c4985a5ca9bd859
--- /dev/null
+++ b/Meissonic/src/scheduler.py
@@ -0,0 +1,175 @@
+# Copyright 2024 The HuggingFace Team and The MeissonFlow Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import math
+from dataclasses import dataclass
+from typing import List, Optional, Tuple, Union
+
+import torch
+
+from diffusers.configuration_utils import ConfigMixin, register_to_config
+from diffusers.utils import BaseOutput
+from diffusers.schedulers.scheduling_utils import SchedulerMixin
+
+
+def gumbel_noise(t, generator=None):
+    device = generator.device if generator is not None else t.device
+    noise = torch.zeros_like(t, device=device).uniform_(0, 1, generator=generator).to(t.device)
+    return -torch.log((-torch.log(noise.clamp(1e-20))).clamp(1e-20))
+
+
+def mask_by_random_topk(mask_len, probs, temperature=1.0, generator=None):
+    confidence = torch.log(probs.clamp(1e-20)) + temperature * gumbel_noise(probs, generator=generator)
+    sorted_confidence = torch.sort(confidence, dim=-1).values
+    cut_off = torch.gather(sorted_confidence, 1, mask_len.long())
+    masking = confidence < cut_off
+    return masking
+
+
+@dataclass
+class SchedulerOutput(BaseOutput):
+    """
+    Output class for the scheduler's `step` function output.
+
+    Args:
+        prev_sample (`torch.LongTensor` of shape `(batch_size, seq_len)` or `(batch_size, height, width)`):
+            Token ids `(x_{t-1})` for the previous timestep, with low-confidence positions re-masked. `prev_sample`
+            should be used as the next model input in the denoising loop.
+        pred_original_sample (`torch.LongTensor` with the same shape as `prev_sample`):
+            The predicted unmasked token ids `(x_{0})` based on the model output from the current timestep.
+            `pred_original_sample` can be used to preview progress or for guidance.
+    """
+
+    prev_sample: torch.Tensor
+    pred_original_sample: Optional[torch.Tensor] = None
+
+
+class Scheduler(SchedulerMixin, ConfigMixin):
+    order = 1
+
+    temperatures: torch.Tensor
+
+    @register_to_config
+    def __init__(
+        self,
+        mask_token_id: int,
+        masking_schedule: str = "cosine",
+    ):
+        self.temperatures = None
+        self.timesteps = None
+
+    def set_timesteps(
+        self,
+        num_inference_steps: int,
+        temperature: Union[int, Tuple[int, int], List[int]] = (2, 0),
+        device: Union[str, torch.device] = None,
+    ):
+        self.timesteps = torch.arange(num_inference_steps, device=device).flip(0)
+
+        if isinstance(temperature, (tuple, list)):
+            self.temperatures = torch.linspace(temperature[0], temperature[1], num_inference_steps, device=device)
+        else:
+            self.temperatures = torch.linspace(temperature, 0.01, num_inference_steps, device=device)
+
+    def step(
+        self,
+        model_output: torch.Tensor,
+        timestep: Union[int, torch.Tensor],
+        sample: torch.LongTensor,
+        starting_mask_ratio: float = 1,
+        generator: Optional[torch.Generator] = None,
+        return_dict: bool = True,
+    ) -> Union[SchedulerOutput, Tuple]:
+        two_dim_input = sample.ndim == 3 and model_output.ndim == 4
+
+        if two_dim_input:
+            batch_size, codebook_size, height, width = model_output.shape
+            sample = sample.reshape(batch_size, height * width)
+            model_output = model_output.reshape(batch_size, codebook_size, height * width).permute(0, 2, 1)
+
+        unknown_map = sample == self.config.mask_token_id
+
+        probs = model_output.softmax(dim=-1)
+
+        device = probs.device
+        probs_ = probs.to(generator.device) if generator is not None else probs  # handles when generator is on CPU
+        if probs_.device.type == "cpu" and probs_.dtype != torch.float32:
+            probs_ = probs_.float()  # multinomial is not implemented for cpu half precision
+        probs_ = probs_.reshape(-1, probs.size(-1))
+        pred_original_sample = torch.multinomial(probs_, 1, generator=generator).to(device=device)
+        pred_original_sample = pred_original_sample[:, 0].view(*probs.shape[:-1])
+        pred_original_sample = torch.where(unknown_map, pred_original_sample, sample)
+
+        if timestep == 0:
+            prev_sample = pred_original_sample
+        else:
+            seq_len = sample.shape[1]
+            step_idx = (self.timesteps == timestep).nonzero()
+            ratio = (step_idx + 1) / len(self.timesteps)
+
+            if self.config.masking_schedule == "cosine":
+                mask_ratio = torch.cos(ratio * math.pi / 2)
+            elif self.config.masking_schedule == "linear":
+                mask_ratio = 1 - ratio
+            else:
+                raise ValueError(f"unknown masking schedule {self.config.masking_schedule}")
+
+            mask_ratio = starting_mask_ratio * mask_ratio
+
+            mask_len = (seq_len * mask_ratio).floor()
+            # do not mask more than amount previously masked
+            mask_len = torch.min(unknown_map.sum(dim=-1, keepdim=True) - 1, mask_len)
+            # mask at least one
+            mask_len = torch.max(torch.tensor([1], device=model_output.device), mask_len)
+
+            selected_probs = torch.gather(probs, -1, pred_original_sample[:, :, None])[:, :, 0]
+            # Ignores the tokens given in the input by overwriting their confidence.
+            selected_probs = torch.where(unknown_map, selected_probs, torch.finfo(selected_probs.dtype).max)
+
+            masking = mask_by_random_topk(mask_len, selected_probs, self.temperatures[step_idx], generator)
+
+            # Masks tokens with lower confidence.
+            prev_sample = torch.where(masking, self.config.mask_token_id, pred_original_sample)
+
+        if two_dim_input:
+            prev_sample = prev_sample.reshape(batch_size, height, width)
+            pred_original_sample = pred_original_sample.reshape(batch_size, height, width)
+
+        if not return_dict:
+            return (prev_sample, pred_original_sample)
+
+        return SchedulerOutput(prev_sample, pred_original_sample)
+
+    def add_noise(self, sample, timesteps, generator=None):
+        step_idx = (self.timesteps == timesteps).nonzero()
+        ratio = (step_idx + 1) / len(self.timesteps)
+
+        if self.config.masking_schedule == "cosine":
+            mask_ratio = torch.cos(ratio * math.pi / 2)
+        elif self.config.masking_schedule == "linear":
+            mask_ratio = 1 - ratio
+        else:
+            raise ValueError(f"unknown masking schedule {self.config.masking_schedule}")
+
+        mask_indices = (
+            torch.rand(
+                sample.shape, device=generator.device if generator is not None else sample.device, generator=generator
+            ).to(sample.device)
+            < mask_ratio
+        )
+
+        masked_sample = sample.clone()
+
+        masked_sample[mask_indices] = self.config.mask_token_id
+
+        return masked_sample
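Review note on `Scheduler.step` above: the number of tokens re-masked after each step follows from the schedule ratio, a floor, and two clamps. The sketch below restates that arithmetic for a single sequence in plain Python so the per-step counts can be checked by hand. `mask_len_for_step` is a hypothetical helper, not part of the scheduler, and it takes `step_idx` (the position in `timesteps`) directly rather than looking it up from a timestep value.

```python
import math


def mask_len_for_step(step_idx: int, num_steps: int, seq_len: int, num_unknown: int,
                      schedule: str = "cosine", starting_mask_ratio: float = 1.0) -> int:
    """Number of tokens to re-mask after step `step_idx` (0-based position in `timesteps`)."""
    ratio = (step_idx + 1) / num_steps
    if schedule == "cosine":
        mask_ratio = math.cos(ratio * math.pi / 2)
    elif schedule == "linear":
        mask_ratio = 1 - ratio
    else:
        raise ValueError(f"unknown masking schedule {schedule}")
    mask_len = math.floor(seq_len * starting_mask_ratio * mask_ratio)
    # Never re-mask more than was masked before (minus one), and always at least one.
    return max(1, min(num_unknown - 1, mask_len))


num_unknown = 100  # start fully masked
for step in range(3):  # the last of 4 steps (timestep 0) keeps every prediction
    num_unknown = mask_len_for_step(step, num_steps=4, seq_len=100, num_unknown=num_unknown)
    print(step, num_unknown)
```

With four steps over 100 tokens, the cosine schedule leaves 92, 70, and then 38 tokens masked. The final step (`timestep == 0`) skips this computation and returns the full prediction.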
diff --git a/Meissonic/src/scheduler_video.py b/Meissonic/src/scheduler_video.py
new file mode 100644
index 0000000000000000000000000000000000000000..a3e02c7d9fe23349b1c1deca7d85018a9467713f
--- /dev/null
+++ b/Meissonic/src/scheduler_video.py
@@ -0,0 +1,188 @@
+# Copyright 2024 The HuggingFace Team and The MeissonFlow Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import math
+from dataclasses import dataclass
+from typing import List, Optional, Tuple, Union
+
+import torch
+
+from diffusers.configuration_utils import ConfigMixin, register_to_config
+from diffusers.utils import BaseOutput
+from diffusers.schedulers.scheduling_utils import SchedulerMixin
+
+
+def gumbel_noise(t, generator=None):
+    device = generator.device if generator is not None else t.device
+    noise = torch.zeros_like(t, device=device).uniform_(0, 1, generator=generator).to(t.device)
+    return -torch.log((-torch.log(noise.clamp(1e-20))).clamp(1e-20))
+
+
+def mask_by_random_topk(mask_len, probs, temperature=1.0, generator=None):
+    confidence = torch.log(probs.clamp(1e-20)) + temperature * gumbel_noise(probs, generator=generator)
+    sorted_confidence = torch.sort(confidence, dim=-1).values
+    cut_off = torch.gather(sorted_confidence, 1, mask_len.long())
+    masking = confidence < cut_off
+    return masking
+
+
+@dataclass
+class SchedulerOutput(BaseOutput):
+    """
+    Output class for the scheduler's `step` function output.
+
+    Args:
+        prev_sample (`torch.LongTensor` of shape `(batch_size, seq_len)`, `(batch_size, height, width)` or `(batch_size, num_frames, height, width)`):
+            Token ids `(x_{t-1})` for the previous timestep, with low-confidence positions re-masked. `prev_sample`
+            should be used as the next model input in the denoising loop.
+        pred_original_sample (`torch.LongTensor` with the same shape as `prev_sample`):
+            The predicted unmasked token ids `(x_{0})` based on the model output from the current timestep.
+            `pred_original_sample` can be used to preview progress or for guidance.
+    """
+
+    prev_sample: torch.Tensor
+    pred_original_sample: Optional[torch.Tensor] = None
+
+
+class Scheduler(SchedulerMixin, ConfigMixin):
+    order = 1
+
+    temperatures: torch.Tensor
+
+    @register_to_config
+    def __init__(
+        self,
+        mask_token_id: int,
+        masking_schedule: str = "cosine",
+    ):
+        self.temperatures = None
+        self.timesteps = None
+
+    def set_timesteps(
+        self,
+        num_inference_steps: int,
+        temperature: Union[int, Tuple[int, int], List[int]] = (2, 0),
+        device: Union[str, torch.device] = None,
+    ):
+        self.timesteps = torch.arange(num_inference_steps, device=device).flip(0)
+
+        if isinstance(temperature, (tuple, list)):
+            self.temperatures = torch.linspace(temperature[0], temperature[1], num_inference_steps, device=device)
+        else:
+            self.temperatures = torch.linspace(temperature, 0.01, num_inference_steps, device=device)
+
+    def step(
+        self,
+        model_output: torch.Tensor,
+        timestep: Union[int, torch.Tensor],
+        sample: torch.LongTensor,
+        starting_mask_ratio: float = 1,
+        generator: Optional[torch.Generator] = None,
+        return_dict: bool = True,
+    ) -> Union[SchedulerOutput, Tuple]:
+        # Handle different input shapes: 1D, 2D (image), or 3D (video)
+        # All are flattened to 1D token sequences [B, N] for unified processing
+        two_dim_input = sample.ndim == 3 and model_output.ndim == 4  # [B, H, W] & [B, vocab, H, W]
+        three_dim_input = sample.ndim == 4 and model_output.ndim == 5  # [B, F, H, W] & [B, vocab, F, H, W]
+
+        if two_dim_input:
+            # Image case: [B, H, W] -> [B, H*W]
+            batch_size, vocab_size, height, width = model_output.shape
+            sample = sample.reshape(batch_size, height * width)
+            model_output = model_output.reshape(batch_size, vocab_size, height * width).permute(0, 2, 1)
+        elif three_dim_input:
+            # Video case: [B, F, H, W] -> [B, F*H*W]
+            batch_size, vocab_size, num_frames, height, width = model_output.shape
+            sample = sample.reshape(batch_size, num_frames * height * width)
+            model_output = model_output.reshape(batch_size, vocab_size, num_frames * height * width).permute(0, 2, 1)
+
+        unknown_map = sample == self.config.mask_token_id
+
+        probs = model_output.softmax(dim=-1)
+
+        device = probs.device
+        probs_ = probs.to(generator.device) if generator is not None else probs  # handles when generator is on CPU
+        if probs_.device.type == "cpu" and probs_.dtype != torch.float32:
+            probs_ = probs_.float()  # multinomial is not implemented for cpu half precision
+        probs_ = probs_.reshape(-1, probs.size(-1))
+        pred_original_sample = torch.multinomial(probs_, 1, generator=generator).to(device=device)
+        pred_original_sample = pred_original_sample[:, 0].view(*probs.shape[:-1])
+        pred_original_sample = torch.where(unknown_map, pred_original_sample, sample)
+
+        if timestep == 0:
+            prev_sample = pred_original_sample
+        else:
+            seq_len = sample.shape[1]
+            step_idx = (self.timesteps == timestep).nonzero()
+            ratio = (step_idx + 1) / len(self.timesteps)
+
+            if self.config.masking_schedule == "cosine":
+                mask_ratio = torch.cos(ratio * math.pi / 2)
+            elif self.config.masking_schedule == "linear":
+                mask_ratio = 1 - ratio
+            else:
+                raise ValueError(f"unknown masking schedule {self.config.masking_schedule}")
+
+            mask_ratio = starting_mask_ratio * mask_ratio
+
+            mask_len = (seq_len * mask_ratio).floor()
+            # do not mask more than amount previously masked
+            mask_len = torch.min(unknown_map.sum(dim=-1, keepdim=True) - 1, mask_len)
+            # mask at least one
+            mask_len = torch.max(torch.tensor([1], device=model_output.device), mask_len)
+
+            selected_probs = torch.gather(probs, -1, pred_original_sample[:, :, None])[:, :, 0]
+            # Ignores the tokens given in the input by overwriting their confidence.
+            selected_probs = torch.where(unknown_map, selected_probs, torch.finfo(selected_probs.dtype).max)
+
+            masking = mask_by_random_topk(mask_len, selected_probs, self.temperatures[step_idx], generator)
+
+            # Masks tokens with lower confidence.
+            prev_sample = torch.where(masking, self.config.mask_token_id, pred_original_sample)
+
+        # Reshape back to original dimensions
+        if two_dim_input:
+            prev_sample = prev_sample.reshape(batch_size, height, width)
+            pred_original_sample = pred_original_sample.reshape(batch_size, height, width)
+        elif three_dim_input:
+            prev_sample = prev_sample.reshape(batch_size, num_frames, height, width)
+            pred_original_sample = pred_original_sample.reshape(batch_size, num_frames, height, width)
+
+        if not return_dict:
+            return (prev_sample, pred_original_sample)
+
+        return SchedulerOutput(prev_sample, pred_original_sample)
+
+    def add_noise(self, sample, timesteps, generator=None):
+        step_idx = (self.timesteps == timesteps).nonzero()
+        ratio = (step_idx + 1) / len(self.timesteps)
+
+        if self.config.masking_schedule == "cosine":
+            mask_ratio = torch.cos(ratio * math.pi / 2)
+        elif self.config.masking_schedule == "linear":
+            mask_ratio = 1 - ratio
+        else:
+            raise ValueError(f"unknown masking schedule {self.config.masking_schedule}")
+
+        mask_indices = (
+            torch.rand(
+                sample.shape, device=generator.device if generator is not None else sample.device, generator=generator
+            ).to(sample.device)
+            < mask_ratio
+        )
+
+        masked_sample = sample.clone()
+
+        masked_sample[mask_indices] = self.config.mask_token_id
+
+        return masked_sample
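Review note on `mask_by_random_topk` above: it picks which predictions to re-mask by adding temperature-scaled Gumbel noise to each token's log-probability and masking everything below the `mask_len`-th smallest value. The following is a single-sequence, stdlib-only analogue for illustration. It reuses the function names from the diff but works on Python lists with a `random.Random` instance, which is an assumption of this sketch and not the scheduler's tensor-based signature.

```python
import math
import random
from typing import List


def gumbel_noise(n: int, rng: random.Random) -> List[float]:
    """Draw n samples of standard Gumbel noise, -log(-log(u)) with u ~ U(0, 1)."""
    out = []
    for _ in range(n):
        u = max(rng.random(), 1e-20)  # clamp away from 0, as the tensor version does
        out.append(-math.log(max(-math.log(u), 1e-20)))
    return out


def mask_by_random_topk(mask_len: int, probs: List[float], temperature: float,
                        rng: random.Random) -> List[bool]:
    """Mask the entries whose noisy log-confidence falls below the mask_len-th smallest."""
    noise = gumbel_noise(len(probs), rng)
    confidence = [math.log(max(p, 1e-20)) + temperature * g for p, g in zip(probs, noise)]
    cut_off = sorted(confidence)[mask_len]
    return [c < cut_off for c in confidence]


rng = random.Random(0)
probs = [0.9, 0.1, 0.5, 0.3]  # probability the model assigned to each sampled token
print(mask_by_random_topk(2, probs, temperature=0.0, rng=rng))
```

At temperature 0 the noise term vanishes and the two least confident tokens (0.1 and 0.3) are masked. Higher temperatures randomize the choice, which is why `set_timesteps` anneals the temperature toward 0 over the steps.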
diff --git a/Meissonic/src/transformer.py b/Meissonic/src/transformer.py
new file mode 100644
index 0000000000000000000000000000000000000000..fbce4e73ae35ea5f3e21084456bef097b3f76c83
--- /dev/null
+++ b/Meissonic/src/transformer.py
@@ -0,0 +1,1116 @@
+# Copyright 2024 Black Forest Labs, The HuggingFace Team, The InstantX Team and The MeissonFlow Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from typing import Any, Dict, Optional, Tuple, Union
+
+import numpy as np
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+
+from diffusers.configuration_utils import ConfigMixin, register_to_config
+from diffusers.loaders import FromOriginalModelMixin, PeftAdapterMixin
+from diffusers.models.attention import FeedForward, BasicTransformerBlock, SkipFFTransformerBlock
+from diffusers.models.attention_processor import (
+ Attention,
+ AttentionProcessor,
+ FluxAttnProcessor2_0,
+ # FusedFluxAttnProcessor2_0,
+)
+from diffusers.models.modeling_utils import ModelMixin
+from diffusers.models.normalization import AdaLayerNormContinuous, AdaLayerNormZero, AdaLayerNormZeroSingle, GlobalResponseNorm, RMSNorm
+from diffusers.utils import USE_PEFT_BACKEND, is_torch_version, logging, scale_lora_layers, unscale_lora_layers
+from diffusers.utils.torch_utils import maybe_allow_in_graph
+from diffusers.models.embeddings import CombinedTimestepGuidanceTextProjEmbeddings, CombinedTimestepTextProjEmbeddings,TimestepEmbedding, get_timestep_embedding #,FluxPosEmbed
+from diffusers.models.modeling_outputs import Transformer2DModelOutput
+from diffusers.models.resnet import Downsample2D, Upsample2D
+
+from typing import List
+
+logger = logging.get_logger(__name__) # pylint: disable=invalid-name
+
+
+
+def get_3d_rotary_pos_embed(
+    embed_dim, crops_coords, grid_size, temporal_size, theta: int = 10000, use_real: bool = True
+) -> Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]:
+    """
+    RoPE for video tokens with 3D structure.
+
+    Args:
+        embed_dim (`int`):
+            The embedding dimension size, corresponding to hidden_size_head.
+        crops_coords (`Tuple[int]`):
+            The top-left and bottom-right coordinates of the crop.
+        grid_size (`Tuple[int]`):
+            The grid size of the spatial positional embedding (height, width).
+        temporal_size (`int`):
+            The size of the temporal dimension.
+        theta (`int`):
+            Base of the rotary frequency schedule.
+        use_real (`bool`):
+            If True, return real part and imaginary part separately. Otherwise, return complex numbers.
+
+    Returns:
+        `(cos, sin)` if `use_real`, else a complex tensor; each has shape `(temporal_size * grid_size[0] * grid_size[1], embed_dim)` when `embed_dim` is divisible by 16.
+    """
+    start, stop = crops_coords
+    grid_h = np.linspace(start[0], stop[0], grid_size[0], endpoint=False, dtype=np.float32)
+    grid_w = np.linspace(start[1], stop[1], grid_size[1], endpoint=False, dtype=np.float32)
+    grid_t = np.linspace(0, temporal_size, temporal_size, endpoint=False, dtype=np.float32)
+
+    # Compute dimensions for each axis
+    dim_t = embed_dim // 4
+    dim_h = embed_dim // 8 * 3
+    dim_w = embed_dim // 8 * 3
+
+    # Temporal frequencies
+    freqs_t = 1.0 / (theta ** (torch.arange(0, dim_t, 2).float() / dim_t))
+    grid_t = torch.from_numpy(grid_t).float()
+    freqs_t = torch.einsum("n , f -> n f", grid_t, freqs_t)
+    freqs_t = freqs_t.repeat_interleave(2, dim=-1)
+
+    # Spatial frequencies for height and width
+    freqs_h = 1.0 / (theta ** (torch.arange(0, dim_h, 2).float() / dim_h))
+    freqs_w = 1.0 / (theta ** (torch.arange(0, dim_w, 2).float() / dim_w))
+    grid_h = torch.from_numpy(grid_h).float()
+    grid_w = torch.from_numpy(grid_w).float()
+    freqs_h = torch.einsum("n , f -> n f", grid_h, freqs_h)
+    freqs_w = torch.einsum("n , f -> n f", grid_w, freqs_w)
+    freqs_h = freqs_h.repeat_interleave(2, dim=-1)
+    freqs_w = freqs_w.repeat_interleave(2, dim=-1)
+
+    # Broadcast and concatenate tensors along specified dimension
+    def broadcast(tensors, dim=-1):
+        num_tensors = len(tensors)
+        shape_lens = {len(t.shape) for t in tensors}
+        assert len(shape_lens) == 1, "tensors must all have the same number of dimensions"
+        shape_len = list(shape_lens)[0]
+        dim = (dim + shape_len) if dim < 0 else dim
+        dims = list(zip(*(list(t.shape) for t in tensors)))
+        expandable_dims = [(i, val) for i, val in enumerate(dims) if i != dim]
+        assert all(
+            [*(len(set(t[1])) <= 2 for t in expandable_dims)]
+        ), "invalid dimensions for broadcastable concatenation"
+        max_dims = [(t[0], max(t[1])) for t in expandable_dims]
+        expanded_dims = [(t[0], (t[1],) * num_tensors) for t in max_dims]
+        expanded_dims.insert(dim, (dim, dims[dim]))
+        expandable_shapes = list(zip(*(t[1] for t in expanded_dims)))
+        tensors = [t[0].expand(*t[1]) for t in zip(tensors, expandable_shapes)]
+        return torch.cat(tensors, dim=dim)
+
+    freqs = broadcast((freqs_t[:, None, None, :], freqs_h[None, :, None, :], freqs_w[None, None, :, :]), dim=-1)
+
+    t, h, w, d = freqs.shape
+    freqs = freqs.view(t * h * w, d)
+
+    # Generate sine and cosine components
+    sin = freqs.sin()
+    cos = freqs.cos()
+
+    if use_real:
+        return cos, sin
+    else:
+        freqs_cis = torch.polar(torch.ones_like(freqs), freqs)
+        return freqs_cis
+
+
+def get_2d_rotary_pos_embed(embed_dim, crops_coords, grid_size, use_real=True):
+ """
+ RoPE for image tokens with 2d structure.
+
+ Args:
+ embed_dim (`int`):
+ The embedding dimension size.
+ crops_coords (`Tuple[int]`):
+ The top-left and bottom-right coordinates of the crop.
+ grid_size (`Tuple[int]`):
+ The grid size of the positional embedding.
+ use_real (`bool`):
+ If True, return real part and imaginary part separately. Otherwise, return complex numbers.
+
+ Returns:
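+ `Tuple[torch.Tensor, torch.Tensor]` or `torch.Tensor`: if `use_real`, a `(cos, sin)` pair, each with shape
+ `(grid_size[0] * grid_size[1], embed_dim)`; otherwise a complex tensor with shape
+ `(grid_size[0] * grid_size[1], embed_dim/2)`.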
+ """
+ start, stop = crops_coords
+ grid_h = np.linspace(start[0], stop[0], grid_size[0], endpoint=False, dtype=np.float32)
+ grid_w = np.linspace(start[1], stop[1], grid_size[1], endpoint=False, dtype=np.float32)
+ grid = np.meshgrid(grid_w, grid_h) # here w goes first
+ grid = np.stack(grid, axis=0) # [2, H, W]; grid[0] holds width coordinates, grid[1] height coordinates
+
+ grid = grid.reshape([2, 1, *grid.shape[1:]])
+ pos_embed = get_2d_rotary_pos_embed_from_grid(embed_dim, grid, use_real=use_real)
+ return pos_embed
+
+
+def get_2d_rotary_pos_embed_from_grid(embed_dim, grid, use_real=False):
+ assert embed_dim % 4 == 0
+
+ # use half of dimensions to encode grid_h
+ emb_h = get_1d_rotary_pos_embed(
+ embed_dim // 2, grid[0].reshape(-1), use_real=use_real
+ ) # (H*W, D/2) if use_real else (H*W, D/4)
+ emb_w = get_1d_rotary_pos_embed(
+ embed_dim // 2, grid[1].reshape(-1), use_real=use_real
+ ) # (H*W, D/2) if use_real else (H*W, D/4)
+
+ if use_real:
+ cos = torch.cat([emb_h[0], emb_w[0]], dim=1) # (H*W, D)
+ sin = torch.cat([emb_h[1], emb_w[1]], dim=1) # (H*W, D)
+ return cos, sin
+ else:
+ emb = torch.cat([emb_h, emb_w], dim=1) # (H*W, D/2)
+ return emb
+
+
+def get_2d_rotary_pos_embed_lumina(embed_dim, len_h, len_w, linear_factor=1.0, ntk_factor=1.0):
+ assert embed_dim % 4 == 0
+
+ emb_h = get_1d_rotary_pos_embed(
+ embed_dim // 2, len_h, linear_factor=linear_factor, ntk_factor=ntk_factor
+ ) # (H, D/4)
+ emb_w = get_1d_rotary_pos_embed(
+ embed_dim // 2, len_w, linear_factor=linear_factor, ntk_factor=ntk_factor
+ ) # (W, D/4)
+ emb_h = emb_h.view(len_h, 1, embed_dim // 4, 1).repeat(1, len_w, 1, 1) # (H, W, D/4, 1)
+ emb_w = emb_w.view(1, len_w, embed_dim // 4, 1).repeat(len_h, 1, 1, 1) # (H, W, D/4, 1)
+
+ emb = torch.cat([emb_h, emb_w], dim=-1).flatten(2) # (H, W, D/2)
+ return emb
+
+
+def get_1d_rotary_pos_embed(
+ dim: int,
+ pos: Union[np.ndarray, int],
+ theta: float = 10000.0,
+ use_real=False,
+ linear_factor=1.0,
+ ntk_factor=1.0,
+ repeat_interleave_real=True,
+ freqs_dtype=torch.float32, # torch.float32 (hunyuan, stable audio), torch.float64 (flux)
+):
+ """
+ Precompute the frequency tensor for complex exponentials (cis) with given dimensions.
+
+ This function calculates a frequency tensor with complex exponentials using the given dimension 'dim' and the
+ position indices 'pos'. The 'theta' parameter scales the frequencies. When `use_real` is False, the returned
+ tensor contains complex values (complex64 for the default `freqs_dtype`).
+
+ Args:
+ dim (`int`): Dimension of the frequency tensor.
+ pos (`np.ndarray` or `int`): Position indices for the frequency tensor. [S] or scalar
+ theta (`float`, *optional*, defaults to 10000.0):
+ Scaling factor for frequency computation. Defaults to 10000.0.
+ use_real (`bool`, *optional*):
+ If True, return real part and imaginary part separately. Otherwise, return complex numbers.
+ linear_factor (`float`, *optional*, defaults to 1.0):
+ Scaling factor for the context extrapolation. Defaults to 1.0.
+ ntk_factor (`float`, *optional*, defaults to 1.0):
+ Scaling factor for the NTK-Aware RoPE. Defaults to 1.0.
+ repeat_interleave_real (`bool`, *optional*, defaults to `True`):
+ If `True` and `use_real`, real part and imaginary part are each interleaved with themselves to reach `dim`.
+ Otherwise, they are concatenated with themselves.
+ freqs_dtype (`torch.float32` or `torch.float64`, *optional*, defaults to `torch.float32`):
+ the dtype of the frequency tensor.
+ Returns:
+ `torch.Tensor`: Precomputed frequency tensor with complex exponentials. [S, D/2]
+ """
+ assert dim % 2 == 0
+
+ if isinstance(pos, int):
+ pos = np.arange(pos)
+ theta = theta * ntk_factor
+ freqs = 1.0 / (theta ** (torch.arange(0, dim, 2, dtype=freqs_dtype)[: (dim // 2)] / dim)) / linear_factor # [D/2]
+ t = torch.from_numpy(pos).to(freqs.device) # type: ignore # [S]
+ freqs = torch.outer(t, freqs) # type: ignore # [S, D/2]
+ if use_real and repeat_interleave_real:
+ freqs_cos = freqs.cos().repeat_interleave(2, dim=1).float() # [S, D]
+ freqs_sin = freqs.sin().repeat_interleave(2, dim=1).float() # [S, D]
+ return freqs_cos, freqs_sin
+ elif use_real:
+ freqs_cos = torch.cat([freqs.cos(), freqs.cos()], dim=-1).float() # [S, D]
+ freqs_sin = torch.cat([freqs.sin(), freqs.sin()], dim=-1).float() # [S, D]
+ return freqs_cos, freqs_sin
+ else:
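+ # Do not call .float() here: casting a complex tensor to a real dtype discards the imaginary part.
+ freqs_cis = torch.polar(torch.ones_like(freqs), freqs) # complex64 for float32 freqs # [S, D/2]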
+ return freqs_cis
+
+
+class FluxPosEmbed(nn.Module):
+ # modified from https://github.com/black-forest-labs/flux/blob/c00d7c60b085fce8058b9df845e036090873f2ce/src/flux/modules/layers.py#L11
+ def __init__(self, theta: int, axes_dim: List[int]):
+ super().__init__()
+ self.theta = theta
+ self.axes_dim = axes_dim
+
+ def forward(self, ids: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
+ n_axes = ids.shape[-1]
+ cos_out = []
+ sin_out = []
+ pos = ids.squeeze().float().cpu().numpy()
+ is_mps = ids.device.type == "mps"
+ freqs_dtype = torch.float32 if is_mps else torch.float64
+ for i in range(n_axes):
+ cos, sin = get_1d_rotary_pos_embed(
+ self.axes_dim[i], pos[:, i], repeat_interleave_real=True, use_real=True, freqs_dtype=freqs_dtype
+ )
+ cos_out.append(cos)
+ sin_out.append(sin)
+ freqs_cos = torch.cat(cos_out, dim=-1).to(ids.device)
+ freqs_sin = torch.cat(sin_out, dim=-1).to(ids.device)
+ return freqs_cos, freqs_sin
+
+
+
+class FusedFluxAttnProcessor2_0:
+ """Attention processor typically used for SD3-like self-attention with fused QKV projections."""
+
+ def __init__(self):
+ if not hasattr(F, "scaled_dot_product_attention"):
+ raise ImportError(
+ "FusedFluxAttnProcessor2_0 requires PyTorch 2.0 or later. Please upgrade PyTorch to use it."
+ )
+
+ def __call__(
+ self,
+ attn: Attention,
+ hidden_states: torch.FloatTensor,
+ encoder_hidden_states: torch.FloatTensor = None,
+ attention_mask: Optional[torch.FloatTensor] = None,
+ image_rotary_emb: Optional[torch.Tensor] = None,
+ ) -> torch.FloatTensor:
+ batch_size, _, _ = hidden_states.shape if encoder_hidden_states is None else encoder_hidden_states.shape
+
+ # `sample` projections.
+ qkv = attn.to_qkv(hidden_states)
+ split_size = qkv.shape[-1] // 3
+ query, key, value = torch.split(qkv, split_size, dim=-1)
+
+ inner_dim = key.shape[-1]
+ head_dim = inner_dim // attn.heads
+
+ query = query.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2)
+ key = key.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2)
+ value = value.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2)
+
+ if attn.norm_q is not None:
+ query = attn.norm_q(query)
+ if attn.norm_k is not None:
+ key = attn.norm_k(key)
+
+ # the attention in SingleTransformerBlock does not use `encoder_hidden_states`
+ # `context` projections.
+ if encoder_hidden_states is not None:
+ encoder_qkv = attn.to_added_qkv(encoder_hidden_states)
+ split_size = encoder_qkv.shape[-1] // 3
+ (
+ encoder_hidden_states_query_proj,
+ encoder_hidden_states_key_proj,
+ encoder_hidden_states_value_proj,
+ ) = torch.split(encoder_qkv, split_size, dim=-1)
+
+ encoder_hidden_states_query_proj = encoder_hidden_states_query_proj.view(
+ batch_size, -1, attn.heads, head_dim
+ ).transpose(1, 2)
+ encoder_hidden_states_key_proj = encoder_hidden_states_key_proj.view(
+ batch_size, -1, attn.heads, head_dim
+ ).transpose(1, 2)
+ encoder_hidden_states_value_proj = encoder_hidden_states_value_proj.view(
+ batch_size, -1, attn.heads, head_dim
+ ).transpose(1, 2)
+
+ if attn.norm_added_q is not None:
+ encoder_hidden_states_query_proj = attn.norm_added_q(encoder_hidden_states_query_proj)
+ if attn.norm_added_k is not None:
+ encoder_hidden_states_key_proj = attn.norm_added_k(encoder_hidden_states_key_proj)
+
+ # attention
+ query = torch.cat([encoder_hidden_states_query_proj, query], dim=2)
+ key = torch.cat([encoder_hidden_states_key_proj, key], dim=2)
+ value = torch.cat([encoder_hidden_states_value_proj, value], dim=2)
+
+ if image_rotary_emb is not None:
+ from diffusers.models.embeddings import apply_rotary_emb
+
+ query = apply_rotary_emb(query, image_rotary_emb)
+ key = apply_rotary_emb(key, image_rotary_emb)
+
+ hidden_states = F.scaled_dot_product_attention(query, key, value, dropout_p=0.0, is_causal=False)
+ hidden_states = hidden_states.transpose(1, 2).reshape(batch_size, -1, attn.heads * head_dim)
+ hidden_states = hidden_states.to(query.dtype)
+
+ if encoder_hidden_states is not None:
+ encoder_hidden_states, hidden_states = (
+ hidden_states[:, : encoder_hidden_states.shape[1]],
+ hidden_states[:, encoder_hidden_states.shape[1] :],
+ )
+
+ # linear proj
+ hidden_states = attn.to_out[0](hidden_states)
+ # dropout
+ hidden_states = attn.to_out[1](hidden_states)
+ encoder_hidden_states = attn.to_add_out(encoder_hidden_states)
+
+ return hidden_states, encoder_hidden_states
+ else:
+ return hidden_states
+
+
+
+@maybe_allow_in_graph
+class SingleTransformerBlock(nn.Module):
+ r"""
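+ A Transformer block following the MMDiT architecture, introduced in Stable Diffusion 3. Unlike `TransformerBlock`,
+ it processes a single token stream: attention and an MLP run in parallel on the same normalized input, and their
+ outputs are fused by one gated projection.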
+
+ Reference: https://arxiv.org/abs/2403.03206
+
+ Parameters:
+ dim (`int`): The number of channels in the input and output.
+ num_attention_heads (`int`): The number of heads to use for multi-head attention.
+ attention_head_dim (`int`): The number of channels in each head.
+ mlp_ratio (`float`, *optional*, defaults to 4.0): Ratio of the MLP hidden dimension to `dim`.
+ """
+
+ def __init__(self, dim, num_attention_heads, attention_head_dim, mlp_ratio=4.0):
+ super().__init__()
+ self.mlp_hidden_dim = int(dim * mlp_ratio)
+
+ self.norm = AdaLayerNormZeroSingle(dim)
+ self.proj_mlp = nn.Linear(dim, self.mlp_hidden_dim)
+ self.act_mlp = nn.GELU(approximate="tanh")
+ self.proj_out = nn.Linear(dim + self.mlp_hidden_dim, dim)
+
+ processor = FluxAttnProcessor2_0()
+ self.attn = Attention(
+ query_dim=dim,
+ cross_attention_dim=None,
+ dim_head=attention_head_dim,
+ heads=num_attention_heads,
+ out_dim=dim,
+ bias=True,
+ processor=processor,
+ qk_norm="rms_norm",
+ eps=1e-6,
+ pre_only=True,
+ )
+
+ def forward(
+ self,
+ hidden_states: torch.FloatTensor,
+ temb: torch.FloatTensor,
+ image_rotary_emb=None,
+ ):
+ residual = hidden_states
+ norm_hidden_states, gate = self.norm(hidden_states, emb=temb)
+ mlp_hidden_states = self.act_mlp(self.proj_mlp(norm_hidden_states))
+
+ attn_output = self.attn(
+ hidden_states=norm_hidden_states,
+ image_rotary_emb=image_rotary_emb,
+ )
+
+ hidden_states = torch.cat([attn_output, mlp_hidden_states], dim=2)
+ gate = gate.unsqueeze(1)
+ hidden_states = gate * self.proj_out(hidden_states)
+ hidden_states = residual + hidden_states
+ if hidden_states.dtype == torch.float16:
+ hidden_states = hidden_states.clip(-65504, 65504)
+
+ return hidden_states
+
+@maybe_allow_in_graph
+class TransformerBlock(nn.Module):
+ r"""
+ A Transformer block following the MMDiT architecture, introduced in Stable Diffusion 3.
+
+ Reference: https://arxiv.org/abs/2403.03206
+
+ Parameters:
+ dim (`int`): The number of channels in the input and output.
+ num_attention_heads (`int`): The number of heads to use for multi-head attention.
+ attention_head_dim (`int`): The number of channels in each head.
+ qk_norm (`str`, *optional*, defaults to `"rms_norm"`): The query/key normalization passed to `Attention`.
+ eps (`float`, *optional*, defaults to 1e-6): The epsilon passed to `Attention`.
+ """
+
+ def __init__(self, dim, num_attention_heads, attention_head_dim, qk_norm="rms_norm", eps=1e-6):
+ super().__init__()
+
+ self.norm1 = AdaLayerNormZero(dim)
+
+ self.norm1_context = AdaLayerNormZero(dim)
+
+ if hasattr(F, "scaled_dot_product_attention"):
+ processor = FluxAttnProcessor2_0()
+ else:
+ raise ValueError(
+ "The current PyTorch version does not support the `scaled_dot_product_attention` function."
+ )
+ self.attn = Attention(
+ query_dim=dim,
+ cross_attention_dim=None,
+ added_kv_proj_dim=dim,
+ dim_head=attention_head_dim,
+ heads=num_attention_heads,
+ out_dim=dim,
+ context_pre_only=False,
+ bias=True,
+ processor=processor,
+ qk_norm=qk_norm,
+ eps=eps,
+ )
+
+ self.norm2 = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)
+ self.ff = FeedForward(dim=dim, dim_out=dim, activation_fn="gelu-approximate")
+ # self.ff = FeedForward(dim=dim, dim_out=dim, activation_fn="swiglu")
+
+ self.norm2_context = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)
+ self.ff_context = FeedForward(dim=dim, dim_out=dim, activation_fn="gelu-approximate")
+ # self.ff_context = FeedForward(dim=dim, dim_out=dim, activation_fn="swiglu")
+
+ # let chunk size default to None
+ self._chunk_size = None
+ self._chunk_dim = 0
+
+ def forward(
+ self,
+ hidden_states: torch.FloatTensor,
+ encoder_hidden_states: torch.FloatTensor,
+ temb: torch.FloatTensor,
+ image_rotary_emb=None,
+ ):
+ norm_hidden_states, gate_msa, shift_mlp, scale_mlp, gate_mlp = self.norm1(hidden_states, emb=temb)
+
+ norm_encoder_hidden_states, c_gate_msa, c_shift_mlp, c_scale_mlp, c_gate_mlp = self.norm1_context(
+ encoder_hidden_states, emb=temb
+ )
+ # Attention.
+ attn_output, context_attn_output = self.attn(
+ hidden_states=norm_hidden_states,
+ encoder_hidden_states=norm_encoder_hidden_states,
+ image_rotary_emb=image_rotary_emb,
+ )
+
+ # Process attention outputs for the `hidden_states`.
+ attn_output = gate_msa.unsqueeze(1) * attn_output
+ hidden_states = hidden_states + attn_output
+
+ norm_hidden_states = self.norm2(hidden_states)
+ norm_hidden_states = norm_hidden_states * (1 + scale_mlp[:, None]) + shift_mlp[:, None]
+
+ ff_output = self.ff(norm_hidden_states)
+ ff_output = gate_mlp.unsqueeze(1) * ff_output
+
+ hidden_states = hidden_states + ff_output
+
+ # Process attention outputs for the `encoder_hidden_states`.
+
+ context_attn_output = c_gate_msa.unsqueeze(1) * context_attn_output
+ encoder_hidden_states = encoder_hidden_states + context_attn_output
+
+ norm_encoder_hidden_states = self.norm2_context(encoder_hidden_states)
+ norm_encoder_hidden_states = norm_encoder_hidden_states * (1 + c_scale_mlp[:, None]) + c_shift_mlp[:, None]
+
+ context_ff_output = self.ff_context(norm_encoder_hidden_states)
+ encoder_hidden_states = encoder_hidden_states + c_gate_mlp.unsqueeze(1) * context_ff_output
+ if encoder_hidden_states.dtype == torch.float16:
+ encoder_hidden_states = encoder_hidden_states.clip(-65504, 65504)
+
+ return encoder_hidden_states, hidden_states
+
+
+class UVit2DConvEmbed(nn.Module):
+ def __init__(self, in_channels, block_out_channels, vocab_size, elementwise_affine, eps, bias):
+ super().__init__()
+ self.embeddings = nn.Embedding(vocab_size, in_channels)
+ self.layer_norm = RMSNorm(in_channels, eps, elementwise_affine)
+ self.conv = nn.Conv2d(in_channels, block_out_channels, kernel_size=1, bias=bias)
+
+ def forward(self, input_ids):
+ embeddings = self.embeddings(input_ids)
+ embeddings = self.layer_norm(embeddings)
+ embeddings = embeddings.permute(0, 3, 1, 2)
+ embeddings = self.conv(embeddings)
+ return embeddings
+
+class ConvMlmLayer(nn.Module):
+ def __init__(
+ self,
+ block_out_channels: int,
+ in_channels: int,
+ use_bias: bool,
+ ln_elementwise_affine: bool,
+ layer_norm_eps: float,
+ codebook_size: int,
+ ):
+ super().__init__()
+ self.conv1 = nn.Conv2d(block_out_channels, in_channels, kernel_size=1, bias=use_bias)
+ self.layer_norm = RMSNorm(in_channels, layer_norm_eps, ln_elementwise_affine)
+ self.conv2 = nn.Conv2d(in_channels, codebook_size, kernel_size=1, bias=use_bias)
+
+ def forward(self, hidden_states):
+ hidden_states = self.conv1(hidden_states)
+ hidden_states = self.layer_norm(hidden_states.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
+ logits = self.conv2(hidden_states)
+ return logits
+
+class SwiGLU(nn.Module):
+ r"""
+ A [variant](https://arxiv.org/abs/2002.05202) of the gated linear unit activation function. It's similar to `GEGLU`
+ but uses SiLU / Swish instead of GeLU.
+
+ Parameters:
+ dim_in (`int`): The number of channels in the input.
+ dim_out (`int`): The number of channels in the output.
+ bias (`bool`, defaults to True): Whether to use a bias in the linear layer.
+ """
+
+ def __init__(self, dim_in: int, dim_out: int, bias: bool = True):
+ super().__init__()
+ self.proj = nn.Linear(dim_in, dim_out * 2, bias=bias)
+ self.activation = nn.SiLU()
+
+ def forward(self, hidden_states):
+ hidden_states = self.proj(hidden_states)
+ hidden_states, gate = hidden_states.chunk(2, dim=-1)
+ return hidden_states * self.activation(gate)
+
+class ConvNextBlock(nn.Module):
+ def __init__(
+ self, channels, layer_norm_eps, ln_elementwise_affine, use_bias, hidden_dropout, hidden_size, res_ffn_factor=4
+ ):
+ super().__init__()
+ self.depthwise = nn.Conv2d(
+ channels,
+ channels,
+ kernel_size=3,
+ padding=1,
+ groups=channels,
+ bias=use_bias,
+ )
+ self.norm = RMSNorm(channels, layer_norm_eps, ln_elementwise_affine)
+ self.channelwise_linear_1 = nn.Linear(channels, int(channels * res_ffn_factor), bias=use_bias)
+ self.channelwise_act = nn.GELU()
+ self.channelwise_norm = GlobalResponseNorm(int(channels * res_ffn_factor))
+ self.channelwise_linear_2 = nn.Linear(int(channels * res_ffn_factor), channels, bias=use_bias)
+ self.channelwise_dropout = nn.Dropout(hidden_dropout)
+ self.cond_embeds_mapper = nn.Linear(hidden_size, channels * 2, use_bias)
+
+ def forward(self, x, cond_embeds):
+ x_res = x
+
+ x = self.depthwise(x)
+
+ x = x.permute(0, 2, 3, 1)
+ x = self.norm(x)
+
+ x = self.channelwise_linear_1(x)
+ x = self.channelwise_act(x)
+ x = self.channelwise_norm(x)
+ x = self.channelwise_linear_2(x)
+ x = self.channelwise_dropout(x)
+
+ x = x.permute(0, 3, 1, 2)
+
+ x = x + x_res
+
+ scale, shift = self.cond_embeds_mapper(F.silu(cond_embeds)).chunk(2, dim=1)
+ x = x * (1 + scale[:, :, None, None]) + shift[:, :, None, None]
+
+ return x
+
+class Simple_UVitBlock(nn.Module):
+ def __init__(
+ self,
+ channels,
+ ln_elementwise_affine,
+ layer_norm_eps,
+ use_bias,
+ downsample: bool,
+ upsample: bool,
+ ):
+ super().__init__()
+
+ if downsample:
+ self.downsample = Downsample2D(
+ channels,
+ use_conv=True,
+ padding=0,
+ name="Conv2d_0",
+ kernel_size=2,
+ norm_type="rms_norm",
+ eps=layer_norm_eps,
+ elementwise_affine=ln_elementwise_affine,
+ bias=use_bias,
+ )
+ else:
+ self.downsample = None
+
+ if upsample:
+ self.upsample = Upsample2D(
+ channels,
+ use_conv_transpose=True,
+ kernel_size=2,
+ padding=0,
+ name="conv",
+ norm_type="rms_norm",
+ eps=layer_norm_eps,
+ elementwise_affine=ln_elementwise_affine,
+ bias=use_bias,
+ interpolate=False,
+ )
+ else:
+ self.upsample = None
+
+ def forward(self, x):
+ if self.downsample is not None:
+ x = self.downsample(x)
+
+ if self.upsample is not None:
+ x = self.upsample(x)
+ return x
+
+class Transformer2DModel(ModelMixin, ConfigMixin, PeftAdapterMixin, FromOriginalModelMixin):
+ """
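+ The Transformer model introduced in Flux, adapted here to discrete image tokens: token ids are embedded by
+ `UVit2DConvEmbed`, and a `ConvMlmLayer` head maps to `codebook_size` logits.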
+
+ Reference: https://blackforestlabs.ai/announcing-black-forest-labs/
+
+ Parameters:
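+ patch_size (`int`, *optional*, defaults to 1): Patch size to turn the input data into small patches.
+ in_channels (`int`, *optional*, defaults to 64): The number of channels in the input.
+ num_layers (`int`, *optional*, defaults to 19): The number of layers of MMDiT blocks to use.
+ num_single_layers (`int`, *optional*, defaults to 38): The number of layers of single DiT blocks to use.
+ attention_head_dim (`int`, *optional*, defaults to 128): The number of channels in each head.
+ num_attention_heads (`int`, *optional*, defaults to 24): The number of heads to use for multi-head attention.
+ joint_attention_dim (`int`, *optional*, defaults to 4096): The number of `encoder_hidden_states` dimensions to use.
+ pooled_projection_dim (`int`, *optional*, defaults to 768): Number of dimensions to use when projecting the `pooled_projections`.
+ guidance_embeds (`bool`, defaults to False): Whether to use guidance embeddings.
+ axes_dims_rope (`Tuple[int]`, *optional*, defaults to `(16, 56, 56)`): The rotary embedding dimension of each position axis.
+ vocab_size (`int`, *optional*, defaults to 8256): The size of the input token embedding table.
+ codebook_size (`int`, *optional*, defaults to 8192): The number of output channels (logits) of the `ConvMlmLayer` head.
+ downsample (`bool`, *optional*, defaults to False): Whether `down_block` contains a downsampling layer.
+ upsample (`bool`, *optional*, defaults to False): Whether `up_block` contains an upsampling layer.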
+ """
+
+ _supports_gradient_checkpointing = False # True
+ # Disabled due to: NotImplementedError: DDPOptimizer backend: Found a higher order op in the graph. This is not
+ # supported. Please turn off DDP optimizer using torch._dynamo.config.optimize_ddp=False. Note that this can cause
+ # performance degradation because there will be one bucket for the entire Dynamo graph.
+ # Please refer to this issue - https://github.com/pytorch/pytorch/issues/104674.
+ _no_split_modules = ["TransformerBlock", "SingleTransformerBlock"]
+
+ @register_to_config
+ def __init__(
+ self,
+ patch_size: int = 1,
+ in_channels: int = 64,
+ num_layers: int = 19,
+ num_single_layers: int = 38,
+ attention_head_dim: int = 128,
+ num_attention_heads: int = 24,
+ joint_attention_dim: int = 4096,
+ pooled_projection_dim: int = 768,
+ guidance_embeds: bool = False, # unused in our implementation
+ axes_dims_rope: Tuple[int] = (16, 56, 56),
+ vocab_size: int = 8256,
+ codebook_size: int = 8192,
+ downsample: bool = False,
+ upsample: bool = False,
+ ):
+ super().__init__()
+ self.out_channels = in_channels
+ self.inner_dim = self.config.num_attention_heads * self.config.attention_head_dim
+
+ self.pos_embed = FluxPosEmbed(theta=10000, axes_dim=axes_dims_rope)
+ text_time_guidance_cls = (
+ CombinedTimestepGuidanceTextProjEmbeddings if guidance_embeds else CombinedTimestepTextProjEmbeddings
+ )
+ self.time_text_embed = text_time_guidance_cls(
+ embedding_dim=self.inner_dim, pooled_projection_dim=self.config.pooled_projection_dim
+ )
+
+ self.context_embedder = nn.Linear(self.config.joint_attention_dim, self.inner_dim)
+
+ self.transformer_blocks = nn.ModuleList(
+ [
+ TransformerBlock(
+ dim=self.inner_dim,
+ num_attention_heads=self.config.num_attention_heads,
+ attention_head_dim=self.config.attention_head_dim,
+ )
+ for i in range(self.config.num_layers)
+ ]
+ )
+
+ self.single_transformer_blocks = nn.ModuleList(
+ [
+ SingleTransformerBlock(
+ dim=self.inner_dim,
+ num_attention_heads=self.config.num_attention_heads,
+ attention_head_dim=self.config.attention_head_dim,
+ )
+ for i in range(self.config.num_single_layers)
+ ]
+ )
+
+
+ self.gradient_checkpointing = False
+
+ in_channels_embed = self.inner_dim
+ ln_elementwise_affine = True
+ layer_norm_eps = 1e-06
+ use_bias = False
+ micro_cond_embed_dim = 1280
+ self.embed = UVit2DConvEmbed(
+ in_channels_embed, self.inner_dim, self.config.vocab_size, ln_elementwise_affine, layer_norm_eps, use_bias
+ )
+ self.mlm_layer = ConvMlmLayer(
+ self.inner_dim, in_channels_embed, use_bias, ln_elementwise_affine, layer_norm_eps, self.config.codebook_size
+ )
+ self.cond_embed = TimestepEmbedding(
+ micro_cond_embed_dim + self.config.pooled_projection_dim, self.inner_dim, sample_proj_bias=use_bias
+ )
+ self.encoder_proj_layer_norm = RMSNorm(self.inner_dim, layer_norm_eps, ln_elementwise_affine)
+ self.project_to_hidden_norm = RMSNorm(in_channels_embed, layer_norm_eps, ln_elementwise_affine)
+ self.project_to_hidden = nn.Linear(in_channels_embed, self.inner_dim, bias=use_bias)
+ self.project_from_hidden_norm = RMSNorm(self.inner_dim, layer_norm_eps, ln_elementwise_affine)
+ self.project_from_hidden = nn.Linear(self.inner_dim, in_channels_embed, bias=use_bias)
+
+ self.down_block = Simple_UVitBlock(
+ self.inner_dim,
+ ln_elementwise_affine,
+ layer_norm_eps,
+ use_bias,
+ downsample,
+ False,
+ )
+ self.up_block = Simple_UVitBlock(
+ self.inner_dim, # block_out_channels
+ ln_elementwise_affine,
+ layer_norm_eps,
+ use_bias,
+ False,
+ upsample=upsample,
+ )
+
+ # self.fuse_qkv_projections()
+
+ @property
+ # Copied from diffusers.models.unets.unet_2d_condition.UNet2DConditionModel.attn_processors
+ def attn_processors(self) -> Dict[str, AttentionProcessor]:
+ r"""
+ Returns:
+ `dict` of attention processors: A dictionary containing all attention processors used in the model, indexed
+ by weight name.
+ """
+ # set recursively
+ processors = {}
+
+ def fn_recursive_add_processors(name: str, module: torch.nn.Module, processors: Dict[str, AttentionProcessor]):
+ if hasattr(module, "get_processor"):
+ processors[f"{name}.processor"] = module.get_processor()
+
+ for sub_name, child in module.named_children():
+ fn_recursive_add_processors(f"{name}.{sub_name}", child, processors)
+
+ return processors
+
+ for name, module in self.named_children():
+ fn_recursive_add_processors(name, module, processors)
+
+ return processors
+
+ # Copied from diffusers.models.unets.unet_2d_condition.UNet2DConditionModel.set_attn_processor
+ def set_attn_processor(self, processor: Union[AttentionProcessor, Dict[str, AttentionProcessor]]):
+ r"""
+ Sets the attention processor to use to compute attention.
+
+ Parameters:
+ processor (`dict` of `AttentionProcessor` or only `AttentionProcessor`):
+ The instantiated processor class or a dictionary of processor classes that will be set as the processor
+ for **all** `Attention` layers.
+
+ If `processor` is a dict, the key needs to define the path to the corresponding cross attention
+ processor. This is strongly recommended when setting trainable attention processors.
+
+ """
+ count = len(self.attn_processors.keys())
+
+ if isinstance(processor, dict) and len(processor) != count:
+ raise ValueError(
+ f"A dict of processors was passed, but the number of processors {len(processor)} does not match the"
+ f" number of attention layers: {count}. Please make sure to pass {count} processor classes."
+ )
+
+ def fn_recursive_attn_processor(name: str, module: torch.nn.Module, processor):
+ if hasattr(module, "set_processor"):
+ if not isinstance(processor, dict):
+ module.set_processor(processor)
+ else:
+ module.set_processor(processor.pop(f"{name}.processor"))
+
+ for sub_name, child in module.named_children():
+ fn_recursive_attn_processor(f"{name}.{sub_name}", child, processor)
+
+ for name, module in self.named_children():
+ fn_recursive_attn_processor(name, module, processor)
+
+ # Copied from diffusers.models.unets.unet_2d_condition.UNet2DConditionModel.fuse_qkv_projections with FusedAttnProcessor2_0->FusedFluxAttnProcessor2_0
+ def fuse_qkv_projections(self):
+ """
+ Enables fused QKV projections. For self-attention modules, all projection matrices (i.e., query, key, value)
+ are fused. For cross-attention modules, key and value projection matrices are fused.
+
+ This API is 🧪 experimental.
+ """
+ self.original_attn_processors = None
+
+ for _, attn_processor in self.attn_processors.items():
+ if "Added" in str(attn_processor.__class__.__name__):
+ raise ValueError("`fuse_qkv_projections()` is not supported for models having added KV projections.")
+
+ self.original_attn_processors = self.attn_processors
+
+ for module in self.modules():
+ if isinstance(module, Attention):
+ module.fuse_projections(fuse=True)
+
+ self.set_attn_processor(FusedFluxAttnProcessor2_0())
+
+ # Copied from diffusers.models.unets.unet_2d_condition.UNet2DConditionModel.unfuse_qkv_projections
+ def unfuse_qkv_projections(self):
+ """Disables the fused QKV projection if enabled.
+
+ This API is 🧪 experimental.
+ """
+ if self.original_attn_processors is not None:
+ self.set_attn_processor(self.original_attn_processors)
+
+ def _set_gradient_checkpointing(self, module, value=False):
+ if hasattr(module, "gradient_checkpointing"):
+ module.gradient_checkpointing = value
+
+ def forward(
+ self,
+ hidden_states: torch.Tensor,
+ encoder_hidden_states: torch.Tensor = None,
+ pooled_projections: torch.Tensor = None,
+ timestep: torch.LongTensor = None,
+ img_ids: torch.Tensor = None,
+ txt_ids: torch.Tensor = None,
+ guidance: torch.Tensor = None,
+ joint_attention_kwargs: Optional[Dict[str, Any]] = None,
+ controlnet_block_samples=None,
+ controlnet_single_block_samples=None,
+ return_dict: bool = True,
+ micro_conds: torch.Tensor = None,
+ ) -> Union[torch.FloatTensor, Transformer2DModelOutput]:
+ """
+ The [`Transformer2DModel`] forward method.
+
+ Args:
+ hidden_states (`torch.LongTensor` of shape `(batch size, height, width)`):
+ Input token ids. They are embedded by `self.embed` before entering the transformer blocks.
+ encoder_hidden_states (`torch.FloatTensor` of shape `(batch size, sequence_len, embed_dims)`):
+ Conditional embeddings (embeddings computed from the input conditions such as prompts) to use.
+ pooled_projections (`torch.FloatTensor` of shape `(batch_size, projection_dim)`): Embeddings projected
+ from the embeddings of input conditions.
+ timestep ( `torch.LongTensor`):
+ Used to indicate denoising step.
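+ img_ids, txt_ids (`torch.Tensor` of shape `(sequence_len, num_axes)`):
+ Position ids of the image and text tokens. They are concatenated (text first) and passed to `self.pos_embed`.
+ controlnet_block_samples (`list` of `torch.Tensor`):
+ A list of tensors that if specified are added to the residuals of transformer blocks.
+ micro_conds (`torch.Tensor` of shape `(batch size, num_micro_conds)`):
+ Micro-conditioning values. Each value gets a 256-dimensional sinusoidal embedding, and the result is
+ concatenated with `pooled_projections`. Required in practice: it is flattened unconditionally.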
+ joint_attention_kwargs (`dict`, *optional*):
+ A kwargs dictionary that if specified is passed along to the `AttentionProcessor` as defined under
+ `self.processor` in
+ [diffusers.models.attention_processor](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py).
+ return_dict (`bool`, *optional*, defaults to `True`):
+ Whether or not to return a [`~models.transformer_2d.Transformer2DModelOutput`] instead of a plain
+ tuple.
+
+ Returns:
+ If `return_dict` is True, an [`~models.transformer_2d.Transformer2DModelOutput`] is returned, otherwise a
+ `tuple` where the first element is the sample tensor.
+ """
+ micro_cond_encode_dim = 256 # same as self.config.micro_cond_encode_dim = 256 from amused
+ micro_cond_embeds = get_timestep_embedding(
+ micro_conds.flatten(), micro_cond_encode_dim, flip_sin_to_cos=True, downscale_freq_shift=0
+ )
+ micro_cond_embeds = micro_cond_embeds.reshape((hidden_states.shape[0], -1))
+
+ pooled_projections = torch.cat([pooled_projections, micro_cond_embeds], dim=1)
+ pooled_projections = pooled_projections.to(dtype=self.dtype)
+ pooled_projections = self.cond_embed(pooled_projections).to(encoder_hidden_states.dtype)
+
+
+ hidden_states = self.embed(hidden_states)
+
+ encoder_hidden_states = self.context_embedder(encoder_hidden_states)
+ encoder_hidden_states = self.encoder_proj_layer_norm(encoder_hidden_states)
+ hidden_states = self.down_block(hidden_states)
+
+ batch_size, channels, height, width = hidden_states.shape
+ hidden_states = hidden_states.permute(0, 2, 3, 1).reshape(batch_size, height * width, channels)
+ hidden_states = self.project_to_hidden_norm(hidden_states)
+ hidden_states = self.project_to_hidden(hidden_states)
+
+
+ if joint_attention_kwargs is not None:
+ joint_attention_kwargs = joint_attention_kwargs.copy()
+ lora_scale = joint_attention_kwargs.pop("scale", 1.0)
+ else:
+ lora_scale = 1.0
+
+ if USE_PEFT_BACKEND:
+ # weight the lora layers by setting `lora_scale` for each PEFT layer
+ scale_lora_layers(self, lora_scale)
+ else:
+ if joint_attention_kwargs is not None and joint_attention_kwargs.get("scale", None) is not None:
+ logger.warning(
+ "Passing `scale` via `joint_attention_kwargs` when not using the PEFT backend is ineffective."
+ )
+
+ timestep = timestep.to(hidden_states.dtype) * 1000
+ if guidance is not None:
+ guidance = guidance.to(hidden_states.dtype) * 1000
+ else:
+ guidance = None
+ temb = (
+ self.time_text_embed(timestep, pooled_projections)
+ if guidance is None
+ else self.time_text_embed(timestep, guidance, pooled_projections)
+ )
+
+ if txt_ids.ndim == 3:
+ logger.warning(
+ "Passing `txt_ids` 3d torch.Tensor is deprecated. "
+ "Please remove the batch dimension and pass it as a 2d torch Tensor"
+ )
+ txt_ids = txt_ids[0]
+ if img_ids.ndim == 3:
+ logger.warning(
+ "Passing `img_ids` 3d torch.Tensor is deprecated. "
+ "Please remove the batch dimension and pass it as a 2d torch Tensor"
+ )
+ img_ids = img_ids[0]
+ ids = torch.cat((txt_ids, img_ids), dim=0)
+
+ image_rotary_emb = self.pos_embed(ids)
+
+ for index_block, block in enumerate(self.transformer_blocks):
+ if self.training and self.gradient_checkpointing:
+
+ def create_custom_forward(module, return_dict=None):
+ def custom_forward(*inputs):
+ if return_dict is not None:
+ return module(*inputs, return_dict=return_dict)
+ else:
+ return module(*inputs)
+
+ return custom_forward
+
+ ckpt_kwargs: Dict[str, Any] = {"use_reentrant": False} if is_torch_version(">=", "1.11.0") else {}
+ encoder_hidden_states, hidden_states = torch.utils.checkpoint.checkpoint(
+ create_custom_forward(block),
+ hidden_states,
+ encoder_hidden_states,
+ temb,
+ image_rotary_emb,
+ **ckpt_kwargs,
+ )
+
+ else:
+ encoder_hidden_states, hidden_states = block(
+ hidden_states=hidden_states,
+ encoder_hidden_states=encoder_hidden_states,
+ temb=temb,
+ image_rotary_emb=image_rotary_emb,
+ )
+
+
+ # controlnet residual
+ if controlnet_block_samples is not None:
+ interval_control = len(self.transformer_blocks) / len(controlnet_block_samples)
+ interval_control = int(np.ceil(interval_control))
+ hidden_states = hidden_states + controlnet_block_samples[index_block // interval_control]
+
+ hidden_states = torch.cat([encoder_hidden_states, hidden_states], dim=1)
+
+ for index_block, block in enumerate(self.single_transformer_blocks):
+ if self.training and self.gradient_checkpointing:
+
+ def create_custom_forward(module, return_dict=None):
+ def custom_forward(*inputs):
+ if return_dict is not None:
+ return module(*inputs, return_dict=return_dict)
+ else:
+ return module(*inputs)
+
+ return custom_forward
+
+ ckpt_kwargs: Dict[str, Any] = {"use_reentrant": False} if is_torch_version(">=", "1.11.0") else {}
+ hidden_states = torch.utils.checkpoint.checkpoint(
+ create_custom_forward(block),
+ hidden_states,
+ temb,
+ image_rotary_emb,
+ **ckpt_kwargs,
+ )
+
+ else:
+ hidden_states = block(
+ hidden_states=hidden_states,
+ temb=temb,
+ image_rotary_emb=image_rotary_emb,
+ )
+
+ # controlnet residual
+ if controlnet_single_block_samples is not None:
+ interval_control = len(self.single_transformer_blocks) / len(controlnet_single_block_samples)
+ interval_control = int(np.ceil(interval_control))
+ hidden_states[:, encoder_hidden_states.shape[1] :, ...] = (
+ hidden_states[:, encoder_hidden_states.shape[1] :, ...]
+ + controlnet_single_block_samples[index_block // interval_control]
+ )
+
+ hidden_states = hidden_states[:, encoder_hidden_states.shape[1] :, ...]
+
+
+ hidden_states = self.project_from_hidden_norm(hidden_states)
+ hidden_states = self.project_from_hidden(hidden_states)
+
+
+ hidden_states = hidden_states.reshape(batch_size, height, width, channels).permute(0, 3, 1, 2)
+
+ hidden_states = self.up_block(hidden_states)
+
+ if USE_PEFT_BACKEND:
+ # remove `lora_scale` from each PEFT layer
+ unscale_lora_layers(self, lora_scale)
+
+ output = self.mlm_layer(hidden_states)
+ # self.unfuse_qkv_projections()
+ if not return_dict:
+ return (output,)
+
+
+ return output
\ No newline at end of file
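Note on the ControlNet residual indexing in the file above: each transformer block picks its residual with `index_block // interval_control`, where `interval_control = ceil(num_blocks / num_samples)`. The standalone sketch below reproduces only that arithmetic so the mapping can be checked by hand. The helper name `controlnet_sample_index` and the block and sample counts are assumptions made for illustration; they do not appear in the patch.

```python
import math


def controlnet_sample_index(block_index: int, num_blocks: int, num_samples: int) -> int:
    """Return which ControlNet residual a given transformer block would receive.

    Hypothetical helper mirroring the arithmetic in the forward pass:
    interval = ceil(num_blocks / num_samples), index = block_index // interval.
    """
    if num_blocks <= 0 or num_samples <= 0:
        raise ValueError("num_blocks and num_samples must be positive")
    interval = math.ceil(num_blocks / num_samples)
    return block_index // interval


# Illustrative sizes only: 5 blocks sharing 2 residuals.
mapping = [controlnet_sample_index(i, num_blocks=5, num_samples=2) for i in range(5)]
print(mapping)  # [0, 0, 0, 1, 1]
```

One consequence of the ceiling is that some trailing residuals may never be used when the counts do not divide evenly. With 5 blocks and 4 residuals the interval is 2, so the blocks read residuals 0, 0, 1, 1, 2 and the residual at index 3 is skipped.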
diff --git a/Meissonic/src/transformer_video.py b/Meissonic/src/transformer_video.py
new file mode 100644
index 0000000000000000000000000000000000000000..d16c9629c95c3f9dca6a949d991c9a973481336d
--- /dev/null
+++ b/Meissonic/src/transformer_video.py
@@ -0,0 +1,1379 @@
+
+import math
+import os
+import warnings
+from typing import Optional
+
+import torch
+import torch.nn as nn
+from diffusers.configuration_utils import ConfigMixin, register_to_config
+from diffusers.models.modeling_utils import ModelMixin
+
+# Set WAN_DISABLE_FLASH_ATTN=1 to force the SDPA fallback even if flash-attn is installed.
+DISABLE_FLASH_ATTN = os.environ.get("WAN_DISABLE_FLASH_ATTN", "0") == "1"
+
+# Global debug flag - set to False to disable debug prints
+DEBUG_TRANSFORMER = False
+
+# flash_attention and attention are defined inline below rather than imported from .attention.
+try:
+ import flash_attn_interface
+ FLASH_ATTN_3_AVAILABLE = True
+except ModuleNotFoundError:
+ FLASH_ATTN_3_AVAILABLE = False
+
+try:
+ import flash_attn
+ FLASH_ATTN_2_AVAILABLE = True
+except ModuleNotFoundError:
+ FLASH_ATTN_2_AVAILABLE = False
+
+
+def _sdpa_attention(
+ q, k, v,
+ q_lens=None,
+ k_lens=None,
+ dropout_p=0.0,
+ softmax_scale=None,
+ q_scale=None,
+ causal=False,
+ dtype=torch.bfloat16,
+):
+ # q/k/v: [B, L, N, D]
+ B, Lq, N, D = q.shape
+ Lk = k.shape[1]
+ out_dtype = q.dtype
+
+ # cast
+ q = q.to(dtype)
+ k = k.to(dtype)
+ v = v.to(dtype)
+
+ # Emulate flash-attn's softmax_scale override (SDPA defaults to 1/sqrt(D)).
+ # If softmax_scale equals 1/sqrt(D) (the usual default), the multiplier below is 1 and has no effect.
+ if softmax_scale is not None:
+ q = q * (softmax_scale * math.sqrt(D))
+ if q_scale is not None:
+ q = q * q_scale
+
+ # [B, N, L, D]
+ q = q.transpose(1, 2)
+ k = k.transpose(1, 2)
+ v = v.transpose(1, 2)
+
+ attn_mask = None
+
+ # key padding mask: shape [B, 1, 1, Lk] additive -inf
+ if k_lens is not None:
+ # ensure on same device/dtype
+ k_lens = k_lens.to(device=q.device, dtype=torch.long)
+
+ kpos = torch.arange(Lk, device=q.device).view(1, 1, 1, Lk)
+ invalid = kpos >= k_lens.view(B, 1, 1, 1)
+
+ attn_mask = torch.zeros((B, 1, 1, Lk), device=q.device, dtype=q.dtype)
+ attn_mask.masked_fill_(invalid, float("-inf"))
+
+ if q_lens is not None:
+ q_lens = q_lens.to(device=q.device, dtype=torch.long)
+
+ # q_lens is not used for masking here (many implementations skip it too; the loss usually ignores padded queries).
+ out = torch.nn.functional.scaled_dot_product_attention(
+ q, k, v,
+ attn_mask=attn_mask,
+ is_causal=causal,
+ dropout_p=dropout_p,
+ )
+
+ out = out.transpose(1, 2).contiguous() # [B, Lq, N, D]
+ return out.to(out_dtype)
+
+
+def flash_attention(
+ q,
+ k,
+ v,
+ q_lens=None,
+ k_lens=None,
+ dropout_p=0.,
+ softmax_scale=None,
+ q_scale=None,
+ causal=False,
+ window_size=(-1, -1),
+ deterministic=False,
+ dtype=torch.bfloat16,
+ version=None,
+):
+ half_dtypes = (torch.float16, torch.bfloat16)
+ assert dtype in half_dtypes
+
+ # If flash-attn is disabled or not installed, fall back to SDPA (no external dependency needed).
+ use_fa3 = (not DISABLE_FLASH_ATTN) and FLASH_ATTN_3_AVAILABLE and (version is None or version == 3)
+ use_fa2 = (not DISABLE_FLASH_ATTN) and (not use_fa3) and FLASH_ATTN_2_AVAILABLE
+
+ if not (use_fa3 or use_fa2):
+ if window_size != (-1, -1):
+ warnings.warn(
+ "flash-attn not available: window_size local attention will be ignored and fall back to global SDPA."
+ )
+ return _sdpa_attention(
+ q=q, k=k, v=v,
+ q_lens=q_lens, k_lens=k_lens,
+ dropout_p=dropout_p,
+ softmax_scale=softmax_scale,
+ q_scale=q_scale,
+ causal=causal,
+ dtype=dtype,
+ )
+
+ # ===== Original flash-attn path (carried over mostly unchanged) =====
+ assert q.device.type == 'cuda' and q.size(-1) <= 256
+
+ b, lq, lk, out_dtype = q.size(0), q.size(1), k.size(1), q.dtype
+
+ def half(x):
+ return x if x.dtype in half_dtypes else x.to(dtype)
+
+ # preprocess query
+ if q_lens is None:
+ q = half(q.flatten(0, 1))
+ q_lens = torch.tensor([lq] * b, dtype=torch.int32, device=q.device)
+ else:
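+ # Caution: with variable q_lens the packed output may not have b * lq rows, so the
+ # unflatten(0, (b, lq)) below can fail; avoid the variable-length q_lens path in general.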
+ q = half(torch.cat([u[:v] for u, v in zip(q, q_lens)]))
+
+ # preprocess key, value
+ if k_lens is None:
+ k = half(k.flatten(0, 1))
+ v = half(v.flatten(0, 1))
+ k_lens = torch.tensor([lk] * b, dtype=torch.int32, device=k.device)
+ else:
+ k = half(torch.cat([u[:v] for u, v in zip(k, k_lens)]))
+ v = half(torch.cat([u[:v] for u, v in zip(v, k_lens)]))
+
+ q = q.to(v.dtype)
+ k = k.to(v.dtype)
+
+ if q_scale is not None:
+ q = q * q_scale
+
+ if use_fa3:
+ x = flash_attn_interface.flash_attn_varlen_func(
+ q=q,
+ k=k,
+ v=v,
+ cu_seqlens_q=torch.cat([q_lens.new_zeros([1]), q_lens]).cumsum(0, dtype=torch.int32),
+ cu_seqlens_k=torch.cat([k_lens.new_zeros([1]), k_lens]).cumsum(0, dtype=torch.int32),
+ seqused_q=None,
+ seqused_k=None,
+ max_seqlen_q=lq,
+ max_seqlen_k=lk,
+ softmax_scale=softmax_scale,
+ causal=causal,
+ deterministic=deterministic
+ )[0].unflatten(0, (b, lq))
+ else:
+ x = flash_attn.flash_attn_varlen_func(
+ q=q,
+ k=k,
+ v=v,
+ cu_seqlens_q=torch.cat([q_lens.new_zeros([1]), q_lens]).cumsum(0, dtype=torch.int32),
+ cu_seqlens_k=torch.cat([k_lens.new_zeros([1]), k_lens]).cumsum(0, dtype=torch.int32),
+ max_seqlen_q=lq,
+ max_seqlen_k=lk,
+ dropout_p=dropout_p,
+ softmax_scale=softmax_scale,
+ causal=causal,
+ window_size=window_size,
+ deterministic=deterministic
+ ).unflatten(0, (b, lq))
+
+ return x.type(out_dtype)
+
+
+# def flash_attention(
+# q,
+# k,
+# v,
+# q_lens=None,
+# k_lens=None,
+# dropout_p=0.,
+# softmax_scale=None,
+# q_scale=None,
+# causal=False,
+# window_size=(-1, -1),
+# deterministic=False,
+# dtype=torch.bfloat16,
+# version=None,
+# ):
+# """
+# q: [B, Lq, Nq, C1].
+# k: [B, Lk, Nk, C1].
+# v: [B, Lk, Nk, C2]. Nq must be divisible by Nk.
+# q_lens: [B].
+# k_lens: [B].
+# dropout_p: float. Dropout probability.
+# softmax_scale: float. The scaling of QK^T before applying softmax.
+# causal: bool. Whether to apply causal attention mask.
+# window_size: (left right). If not (-1, -1), apply sliding window local attention.
+# deterministic: bool. If True, slightly slower and uses more memory.
+# dtype: torch.dtype. Apply when dtype of q/k/v is not float16/bfloat16.
+# """
+# half_dtypes = (torch.float16, torch.bfloat16)
+# assert dtype in half_dtypes
+# assert q.device.type == 'cuda' and q.size(-1) <= 256
+
+# # params
+# b, lq, lk, out_dtype = q.size(0), q.size(1), k.size(1), q.dtype
+
+# def half(x):
+# return x if x.dtype in half_dtypes else x.to(dtype)
+
+# # preprocess query
+# if q_lens is None:
+# q = half(q.flatten(0, 1))
+# q_lens = torch.tensor(
+# [lq] * b, dtype=torch.int32).to(
+# device=q.device, non_blocking=True)
+# else:
+# q = half(torch.cat([u[:v] for u, v in zip(q, q_lens)]))
+
+# # preprocess key, value
+# if k_lens is None:
+# k = half(k.flatten(0, 1))
+# v = half(v.flatten(0, 1))
+# k_lens = torch.tensor(
+# [lk] * b, dtype=torch.int32).to(
+# device=k.device, non_blocking=True)
+# else:
+# k = half(torch.cat([u[:v] for u, v in zip(k, k_lens)]))
+# v = half(torch.cat([u[:v] for u, v in zip(v, k_lens)]))
+
+# q = q.to(v.dtype)
+# k = k.to(v.dtype)
+
+# if q_scale is not None:
+# q = q * q_scale
+
+# if version is not None and version == 3 and not FLASH_ATTN_3_AVAILABLE:
+# warnings.warn(
+# 'Flash attention 3 is not available, use flash attention 2 instead.'
+# )
+
+# # apply attention
+# if (version is None or version == 3) and FLASH_ATTN_3_AVAILABLE:
+# # Note: dropout_p, window_size are not supported in FA3 now.
+# x = flash_attn_interface.flash_attn_varlen_func(
+# q=q,
+# k=k,
+# v=v,
+# cu_seqlens_q=torch.cat([q_lens.new_zeros([1]), q_lens]).cumsum(
+# 0, dtype=torch.int32).to(q.device, non_blocking=True),
+# cu_seqlens_k=torch.cat([k_lens.new_zeros([1]), k_lens]).cumsum(
+# 0, dtype=torch.int32).to(q.device, non_blocking=True),
+# seqused_q=None,
+# seqused_k=None,
+# max_seqlen_q=lq,
+# max_seqlen_k=lk,
+# softmax_scale=softmax_scale,
+# causal=causal,
+# deterministic=deterministic)[0].unflatten(0, (b, lq))
+# else:
+# assert FLASH_ATTN_2_AVAILABLE
+# x = flash_attn.flash_attn_varlen_func(
+# q=q,
+# k=k,
+# v=v,
+# cu_seqlens_q=torch.cat([q_lens.new_zeros([1]), q_lens]).cumsum(
+# 0, dtype=torch.int32).to(q.device, non_blocking=True),
+# cu_seqlens_k=torch.cat([k_lens.new_zeros([1]), k_lens]).cumsum(
+# 0, dtype=torch.int32).to(q.device, non_blocking=True),
+# max_seqlen_q=lq,
+# max_seqlen_k=lk,
+# dropout_p=dropout_p,
+# softmax_scale=softmax_scale,
+# causal=causal,
+# window_size=window_size,
+# deterministic=deterministic).unflatten(0, (b, lq))
+
+# # output
+# return x.type(out_dtype)
+
+
+def attention(
+ q,
+ k,
+ v,
+ q_lens=None,
+ k_lens=None,
+ dropout_p=0.,
+ softmax_scale=None,
+ q_scale=None,
+ causal=False,
+ window_size=(-1, -1),
+ deterministic=False,
+ dtype=torch.bfloat16,
+ fa_version=None,
+):
+ if FLASH_ATTN_2_AVAILABLE or FLASH_ATTN_3_AVAILABLE:
+ return flash_attention(
+ q=q,
+ k=k,
+ v=v,
+ q_lens=q_lens,
+ k_lens=k_lens,
+ dropout_p=dropout_p,
+ softmax_scale=softmax_scale,
+ q_scale=q_scale,
+ causal=causal,
+ window_size=window_size,
+ deterministic=deterministic,
+ dtype=dtype,
+ version=fa_version,
+ )
+ else:
+ if q_lens is not None or k_lens is not None:
+ warnings.warn(
+ 'Padding mask is disabled when using scaled_dot_product_attention. It can have a significant impact on performance.'
+ )
+ attn_mask = None
+
+ q = q.transpose(1, 2).to(dtype)
+ k = k.transpose(1, 2).to(dtype)
+ v = v.transpose(1, 2).to(dtype)
+
+ out = torch.nn.functional.scaled_dot_product_attention(
+ q, k, v, attn_mask=attn_mask, is_causal=causal, dropout_p=dropout_p)
+
+ out = out.transpose(1, 2).contiguous()
+ return out
+
+
+__all__ = ['WanModel']
+
+
+def sinusoidal_embedding_1d(dim, position):
+ # preprocess
+ assert dim % 2 == 0
+ half = dim // 2
+ # Compute in float64 for precision, staying on the device of `position`
+ # (no CPU round trip is performed)
+ device = position.device
+ position = position.to(torch.float64)
+
+ # calculation
+ # Create range tensor on same device as position
+ arange_tensor = torch.arange(half, dtype=torch.float64, device=device)
+ sinusoid = torch.outer(
+ position, torch.pow(10000, -arange_tensor.div(half)))
+ x = torch.cat([torch.cos(sinusoid), torch.sin(sinusoid)], dim=1)
+ return x
+
+
+@torch.amp.autocast('cuda', enabled=False)
+def rope_params(max_seq_len, dim, theta=10000):
+ assert dim % 2 == 0
+ freqs = torch.outer(
+ torch.arange(max_seq_len),
+ 1.0 / torch.pow(theta,
+ torch.arange(0, dim, 2).to(torch.float64).div(dim)))
+ freqs = torch.polar(torch.ones_like(freqs), freqs)
+ return freqs
+
+
+@torch.amp.autocast('cuda', enabled=False)
+def rope_apply(x, grid_sizes, freqs):
+ n, c = x.size(2), x.size(3) // 2
+ # Save original dtype to restore it later
+ original_dtype = x.dtype
+
+ # split freqs
+ freqs = freqs.split([c - 2 * (c // 3), c // 3, c // 3], dim=1)
+
+ # loop over samples
+ output = []
+ for i, (f, h, w) in enumerate(grid_sizes.tolist()):
+ seq_len = f * h * w
+
+ # precompute multipliers
+ x_i = torch.view_as_complex(x[i, :seq_len].to(torch.float64).reshape(
+ seq_len, n, -1, 2))
+ freqs_i = torch.cat([
+ freqs[0][:f].view(f, 1, 1, -1).expand(f, h, w, -1),
+ freqs[1][:h].view(1, h, 1, -1).expand(f, h, w, -1),
+ freqs[2][:w].view(1, 1, w, -1).expand(f, h, w, -1)
+ ],
+ dim=-1).reshape(seq_len, 1, -1)
+
+ # apply rotary embedding
+ x_i = torch.view_as_real(x_i * freqs_i).flatten(2)
+ # Convert back to original dtype before concatenating
+ x_i = x_i.to(dtype=original_dtype)
+ # Handle the remaining part of the sequence
+ x_remaining = x[i, seq_len:]
+ if x_remaining.numel() > 0:
+ x_i = torch.cat([x_i, x_remaining])
+
+ # append to collection
+ output.append(x_i)
+ # Stack and ensure dtype matches original input
+ return torch.stack(output).to(dtype=original_dtype)
+
+
+class WanRMSNorm(nn.Module):
+
+ def __init__(self, dim, eps=1e-5):
+ super().__init__()
+ self.dim = dim
+ self.eps = eps
+ self.weight = nn.Parameter(torch.ones(dim))
+
+ def forward(self, x):
+ r"""
+ Args:
+ x(Tensor): Shape [B, L, C]
+ """
+ # Ensure weight dtype matches input dtype
+ return self._norm(x.float()).type_as(x) * self.weight.type_as(x)
+
+ def _norm(self, x):
+ return x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
+
+
+class WanLayerNorm(nn.LayerNorm):
+
+ def __init__(self, dim, eps=1e-6, elementwise_affine=False):
+ super().__init__(dim, elementwise_affine=elementwise_affine, eps=eps)
+
+ def forward(self, x):
+ r"""
+ Args:
+ x(Tensor): Shape [B, L, C]
+ """
+ # Convert to float32 for numerical stability, ensuring weights match input dtype
+ original_dtype = x.dtype
+ x_float = x.float()
+ if self.elementwise_affine:
+ weight_float = self.weight.float() if self.weight is not None else None
+ bias_float = self.bias.float() if self.bias is not None else None
+ # Use torch.nn.functional.layer_norm directly with converted weights
+ result = torch.nn.functional.layer_norm(x_float, self.normalized_shape, weight_float, bias_float, self.eps)
+ else:
+ result = super().forward(x_float)
+ return result.to(dtype=original_dtype)
+
+
+class WanSelfAttention(nn.Module):
+
+ def __init__(self,
+ dim,
+ num_heads,
+ window_size=(-1, -1),
+ qk_norm=True,
+ eps=1e-6):
+ assert dim % num_heads == 0
+ super().__init__()
+ self.dim = dim
+ self.num_heads = num_heads
+ self.head_dim = dim // num_heads
+ self.window_size = window_size
+ self.qk_norm = qk_norm
+ self.eps = eps
+
+ # layers
+ self.q = nn.Linear(dim, dim)
+ self.k = nn.Linear(dim, dim)
+ self.v = nn.Linear(dim, dim)
+ self.o = nn.Linear(dim, dim)
+ self.norm_q = WanRMSNorm(dim, eps=eps) if qk_norm else nn.Identity()
+ self.norm_k = WanRMSNorm(dim, eps=eps) if qk_norm else nn.Identity()
+
+ def forward(self, x, seq_lens, grid_sizes, freqs):
+ r"""
+ Args:
+ x(Tensor): Shape [B, L, C]
+ seq_lens(Tensor): Shape [B]
+ grid_sizes(Tensor): Shape [B, 3], the second dimension contains (F, H, W)
+ freqs(Tensor): Rope freqs, shape [1024, C / num_heads / 2]
+ """
+ b, s, n, d = *x.shape[:2], self.num_heads, self.head_dim
+
+ # query, key, value function
+ def qkv_fn(x):
+ q = self.norm_q(self.q(x)).view(b, s, n, d)
+ k = self.norm_k(self.k(x)).view(b, s, n, d)
+ v = self.v(x).view(b, s, n, d)
+ return q, k, v
+
+ q, k, v = qkv_fn(x)
+
+ # Save input dtype to ensure output matches
+ input_dtype = x.dtype
+
+ x = flash_attention(
+ q=rope_apply(q, grid_sizes, freqs),
+ k=rope_apply(k, grid_sizes, freqs),
+ v=v,
+ k_lens=seq_lens,
+ window_size=self.window_size)
+
+ # Ensure output dtype matches input dtype (in case rope_apply or flash_attention changed it)
+ x = x.to(dtype=input_dtype)
+
+ # output
+ x = x.flatten(2)
+ x = self.o(x)
+ return x
+
+
+class WanCrossAttention(WanSelfAttention):
+
+ def forward(self, x, context, context_lens):
+ r"""
+ Args:
+ x(Tensor): Shape [B, L1, C]
+ context(Tensor): Shape [B, L2, C]
+ context_lens(Tensor): Shape [B]
+ """
+ b, n, d = x.size(0), self.num_heads, self.head_dim
+
+ # Save input dtype to ensure output matches
+ input_dtype = x.dtype
+
+ # compute query, key, value
+ q = self.norm_q(self.q(x)).view(b, -1, n, d)
+ k = self.norm_k(self.k(context)).view(b, -1, n, d)
+ v = self.v(context).view(b, -1, n, d)
+
+ # compute attention
+ x = flash_attention(q, k, v, k_lens=context_lens)
+
+ # Ensure output dtype matches input dtype
+ x = x.to(dtype=input_dtype)
+
+ # output
+ x = x.flatten(2)
+ x = self.o(x)
+ return x
+
+
+class WanAttentionBlock(nn.Module):
+
+ def __init__(self,
+ dim,
+ ffn_dim,
+ num_heads,
+ window_size=(-1, -1),
+ qk_norm=True,
+ cross_attn_norm=False,
+ eps=1e-6):
+ super().__init__()
+ self.dim = dim
+ self.ffn_dim = ffn_dim
+ self.num_heads = num_heads
+ self.window_size = window_size
+ self.qk_norm = qk_norm
+ self.cross_attn_norm = cross_attn_norm
+ self.eps = eps
+
+ # layers
+ self.norm1 = WanLayerNorm(dim, eps)
+ self.self_attn = WanSelfAttention(dim, num_heads, window_size, qk_norm,
+ eps)
+ self.norm3 = WanLayerNorm(
+ dim, eps,
+ elementwise_affine=True) if cross_attn_norm else nn.Identity()
+ self.cross_attn = WanCrossAttention(dim, num_heads, (-1, -1), qk_norm,
+ eps)
+ self.norm2 = WanLayerNorm(dim, eps)
+ self.ffn = nn.Sequential(
+ nn.Linear(dim, ffn_dim), nn.GELU(approximate='tanh'),
+ nn.Linear(ffn_dim, dim))
+
+ # modulation
+ self.modulation = nn.Parameter(torch.randn(1, 6, dim) / dim**0.5)
+
+ def forward(
+ self,
+ x,
+ e,
+ seq_lens,
+ grid_sizes,
+ freqs,
+ context,
+ context_lens,
+ ):
+ r"""
+ Args:
+ x(Tensor): Shape [B, L, C]
+ e(Tensor): Shape [B, L1, 6, C]
+ seq_lens(Tensor): Shape [B], length of each sequence in batch
+ grid_sizes(Tensor): Shape [B, 3], the second dimension contains (F, H, W)
+ freqs(Tensor): Rope freqs, shape [1024, C / num_heads / 2]
+ """
+ # Convert e to float32 for modulation computation (modulation expects float32)
+ e_float32 = e.to(dtype=torch.float32) if e.dtype != torch.float32 else e
+ with torch.amp.autocast('cuda', dtype=torch.float32):
+ e = (self.modulation.unsqueeze(0) + e_float32).chunk(6, dim=2)
+ assert e[0].dtype == torch.float32
+
+ # self-attention
+ # Ensure input dtype matches model weights (convert e to match x's dtype)
+ x_dtype = x.dtype
+ e_0 = e[0].squeeze(2).to(dtype=x_dtype)
+ e_1 = e[1].squeeze(2).to(dtype=x_dtype)
+ e_2 = e[2].squeeze(2).to(dtype=x_dtype)
+ attn_input = self.norm1(x) * (1 + e_1) + e_0
+ y = self.self_attn(attn_input, seq_lens, grid_sizes, freqs)
+ # Ensure dtype consistency: y and e_2 should match x's dtype
+ x = x + (y * e_2).to(dtype=x_dtype)
+
+ # cross-attention & ffn function
+ def cross_attn_ffn(x, context, context_lens, e):
+ x = x + self.cross_attn(self.norm3(x), context, context_lens)
+ # Ensure dtype consistency for FFN input
+ x_dtype = x.dtype
+ e_3 = e[3].squeeze(2).to(dtype=x_dtype)
+ e_4 = e[4].squeeze(2).to(dtype=x_dtype)
+ e_5 = e[5].squeeze(2).to(dtype=x_dtype)
+ ffn_input = self.norm2(x) * (1 + e_4) + e_3
+ y = self.ffn(ffn_input)
+ # Ensure dtype consistency: y and e_5 should match x's dtype
+ x = x + (y * e_5).to(dtype=x_dtype)
+ return x
+
+ x = cross_attn_ffn(x, context, context_lens, e)
+ return x
+
+
+class Head(nn.Module):
+
+ def __init__(self, dim, out_dim, patch_size, eps=1e-6):
+ super().__init__()
+ self.dim = dim
+ self.out_dim = out_dim
+ self.patch_size = patch_size
+ self.eps = eps
+
+ # layers
+ out_dim = math.prod(patch_size) * out_dim
+ self.norm = WanLayerNorm(dim, eps)
+ self.head = nn.Linear(dim, out_dim)
+
+ # modulation
+ self.modulation = nn.Parameter(torch.randn(1, 2, dim) / dim**0.5)
+
+ def forward(self, x, e):
+ r"""
+ Args:
+ x(Tensor): Shape [B, L1, C]
+ e(Tensor): Shape [B, L1, C]
+ """
+ # Convert e to float32 for modulation computation (modulation expects float32)
+ e_float32 = e.to(dtype=torch.float32) if e.dtype != torch.float32 else e
+ with torch.amp.autocast('cuda', dtype=torch.float32):
+ e = (self.modulation.unsqueeze(0) + e_float32.unsqueeze(2)).chunk(2, dim=2)
+ # Ensure dtype consistency: convert e to match x's dtype
+ x_dtype = x.dtype
+ e_0 = e[0].squeeze(2).to(dtype=x_dtype)
+ e_1 = e[1].squeeze(2).to(dtype=x_dtype)
+ head_input = self.norm(x) * (1 + e_1) + e_0
+ x = self.head(head_input)
+ return x
+
+
+class WanModel(ModelMixin, ConfigMixin):
+ r"""
+ Wan diffusion backbone supporting both text-to-video and image-to-video.
+ """
+
+ ignore_for_config = [
+ 'patch_size', 'cross_attn_norm', 'qk_norm', 'text_dim', 'window_size'
+ ]
+ _no_split_modules = ['WanAttentionBlock']
+
+ @register_to_config
+ def __init__(self,
+ model_type='t2v',
+ patch_size=(1, 2, 2),
+ text_len=512,
+ in_dim=16,
+ dim=2048,
+ ffn_dim=8192,
+ freq_dim=256,
+ text_dim=4096,
+ out_dim=16,
+ num_heads=16,
+ num_layers=32,
+ window_size=(-1, -1),
+ qk_norm=True,
+ cross_attn_norm=True,
+ eps=1e-6):
+ r"""
+ Initialize the diffusion model backbone.
+
+ Args:
+ model_type (`str`, *optional*, defaults to 't2v'):
+ Model variant - 't2v' (text-to-video) or 'i2v' (image-to-video); 'ti2v' and 's2v' also pass the assertion below
+ patch_size (`tuple`, *optional*, defaults to (1, 2, 2)):
+ 3D patch dimensions for video embedding (t_patch, h_patch, w_patch)
+ text_len (`int`, *optional*, defaults to 512):
+ Fixed length for text embeddings
+ in_dim (`int`, *optional*, defaults to 16):
+ Input video channels (C_in)
+ dim (`int`, *optional*, defaults to 2048):
+ Hidden dimension of the transformer
+ ffn_dim (`int`, *optional*, defaults to 8192):
+ Intermediate dimension in feed-forward network
+ freq_dim (`int`, *optional*, defaults to 256):
+ Dimension for sinusoidal time embeddings
+ text_dim (`int`, *optional*, defaults to 4096):
+ Input dimension for text embeddings
+ out_dim (`int`, *optional*, defaults to 16):
+ Output video channels (C_out)
+ num_heads (`int`, *optional*, defaults to 16):
+ Number of attention heads
+ num_layers (`int`, *optional*, defaults to 32):
+ Number of transformer blocks
+ window_size (`tuple`, *optional*, defaults to (-1, -1)):
+ Window size for local attention (-1 indicates global attention)
+ qk_norm (`bool`, *optional*, defaults to True):
+ Enable query/key normalization
+ cross_attn_norm (`bool`, *optional*, defaults to True):
+ Enable cross-attention normalization
+ eps (`float`, *optional*, defaults to 1e-6):
+ Epsilon value for normalization layers
+ """
+
+ super().__init__()
+
+ assert model_type in ['t2v', 'i2v', 'ti2v', 's2v']
+ self.model_type = model_type
+
+ self.patch_size = patch_size
+ self.text_len = text_len
+ self.in_dim = in_dim
+ self.dim = dim
+ self.ffn_dim = ffn_dim
+ self.freq_dim = freq_dim
+ self.text_dim = text_dim
+ self.out_dim = out_dim
+ self.num_heads = num_heads
+ self.num_layers = num_layers
+ self.window_size = window_size
+ self.qk_norm = qk_norm
+ self.cross_attn_norm = cross_attn_norm
+ self.eps = eps
+
+ # embeddings
+ self.patch_embedding = nn.Conv3d(
+ in_dim, dim, kernel_size=patch_size, stride=patch_size)
+ self.text_embedding = nn.Sequential(
+ nn.Linear(text_dim, dim), nn.GELU(approximate='tanh'),
+ nn.Linear(dim, dim))
+
+ self.time_embedding = nn.Sequential(
+ nn.Linear(freq_dim, dim), nn.SiLU(), nn.Linear(dim, dim))
+ self.time_projection = nn.Sequential(nn.SiLU(), nn.Linear(dim, dim * 6))
+
+ # blocks
+ self.blocks = nn.ModuleList([
+ WanAttentionBlock(dim, ffn_dim, num_heads, window_size, qk_norm,
+ cross_attn_norm, eps) for _ in range(num_layers)
+ ])
+
+ # head
+ self.head = Head(dim, out_dim, patch_size, eps)
+
+ # buffers (don't use register_buffer otherwise dtype will be changed in to())
+ assert (dim % num_heads) == 0 and (dim // num_heads) % 2 == 0
+ d = dim // num_heads
+ self.freqs = torch.cat([
+ rope_params(1024, d - 4 * (d // 6)),
+ rope_params(1024, 2 * (d // 6)),
+ rope_params(1024, 2 * (d // 6))
+ ],
+ dim=1)
+
+ # initialize weights
+ self.init_weights()
+
+ def forward(
+ self,
+ x,
+ t,
+ context,
+ seq_len,
+ y=None,
+ context_lens=None,
+ ):
+ r"""
+ Forward pass through the diffusion model
+
+ Args:
+ x (List[Tensor]):
+ List of input video tensors, each with shape [C_in, F, H, W]
+ t (Tensor):
+ Diffusion timesteps tensor of shape [B]
+ context (List[Tensor]):
+ List of text embeddings each with shape [L, C]
+ seq_len (`int`):
+ Maximum sequence length for positional encoding
+ y (List[Tensor], *optional*):
+ Conditional video inputs for image-to-video mode, same shape as x
+
+ Returns:
+ List[Tensor]:
+ List of denoised video tensors with original input shapes [C_out, F, H / 8, W / 8]
+ """
+ if self.model_type == 'i2v':
+ assert y is not None
+ # params
+ device = self.patch_embedding.weight.device
+ if self.freqs.device != device:
+ self.freqs = self.freqs.to(device)
+
+ if y is not None:
+ x = [torch.cat([u, v], dim=0) for u, v in zip(x, y)]
+
+ # embeddings
+ # Ensure input dtype matches patch_embedding weight dtype
+ patch_weight_dtype = self.patch_embedding.weight.dtype
+ x = [self.patch_embedding(u.unsqueeze(0).to(dtype=patch_weight_dtype)) for u in x]
+ grid_sizes = torch.stack(
+ [torch.tensor(u.shape[2:], dtype=torch.long) for u in x])
+ x = [u.flatten(2).transpose(1, 2) for u in x]
+ seq_lens = torch.tensor([u.size(1) for u in x], dtype=torch.long)
+ # seq_lens = torch.tensor([u.size(1) for u in x], dtype=torch.long, device=u.device)
+
+ assert seq_lens.max() <= seq_len
+ x = torch.cat([
+ torch.cat([u, u.new_zeros(1, seq_len - u.size(1), u.size(2))],
+ dim=1) for u in x
+ ])
+
+ # time embeddings
+ if t.dim() == 1:
+ # [B] -> [B, seq_len]; unsqueeze first so the expand also works for batch sizes > 1
+ t = t.unsqueeze(1).expand(t.size(0), seq_len)
+ with torch.amp.autocast('cuda', dtype=torch.float32):
+ bt = t.size(0)
+ t = t.flatten()
+ e = self.time_embedding(
+ sinusoidal_embedding_1d(self.freq_dim,
+ t).unflatten(0, (bt, seq_len)).float())
+ e0 = self.time_projection(e).unflatten(2, (6, self.dim))
+ assert e.dtype == torch.float32 and e0.dtype == torch.float32
+
+ # Keep e and e0 as float32 for modulation computation
+ # They will be converted to x.dtype inside WanAttentionBlock.forward and Head.forward when needed
+
+ # context
+ # Use provided context_lens or compute from context if not provided
+ if context_lens is None:
+ # Fallback: assume all contexts have full length
+ context_lens = torch.full((len(context),), self.text_len, dtype=torch.long, device=x[0].device)
+ else:
+ # Ensure context_lens is on correct device
+ context_lens = context_lens.to(device=x[0].device, dtype=torch.long)
+
+ # Ensure context input dtype matches text_embedding weight dtype
+ text_weight_dtype = self.text_embedding[0].weight.dtype
+ context = self.text_embedding(
+ torch.stack([
+ torch.cat(
+ [u, u.new_zeros(self.text_len - u.size(0), u.size(1))])
+ for u in context
+ ]).to(dtype=text_weight_dtype))
+
+ # arguments
+ kwargs = dict(
+ e=e0,
+ seq_lens=seq_lens,
+ grid_sizes=grid_sizes,
+ freqs=self.freqs,
+ context=context,
+ context_lens=context_lens)
+
+ for block in self.blocks:
+ x = block(x, **kwargs)
+
+ # head
+ x = self.head(x, e)
+
+ # unpatchify
+ x = self.unpatchify(x, grid_sizes)
+ return [u.float() for u in x]
+
+ def unpatchify(self, x, grid_sizes):
+ r"""
+ Reconstruct video tensors from patch embeddings.
+
+ Args:
+ x (List[Tensor]):
+ List of patchified features, each with shape [L, C_out * prod(patch_size)]
+ grid_sizes (Tensor):
+ Original spatial-temporal grid dimensions before patching,
+ shape [B, 3] (3 dimensions correspond to F_patches, H_patches, W_patches)
+
+ Returns:
+ List[Tensor]:
+ Reconstructed video tensors with shape [C_out, F, H / 8, W / 8]
+ """
+
+ c = self.out_dim
+ out = []
+ for u, v in zip(x, grid_sizes.tolist()):
+ u = u[:math.prod(v)].view(*v, *self.patch_size, c)
+ u = torch.einsum('fhwpqrc->cfphqwr', u)
+ u = u.reshape(c, *[i * j for i, j in zip(v, self.patch_size)])
+ out.append(u)
+ return out
+
+ def init_weights(self):
+ r"""
+ Initialize model parameters using Xavier initialization.
+ """
+
+ # basic init
+ for m in self.modules():
+ if isinstance(m, nn.Linear):
+ nn.init.xavier_uniform_(m.weight)
+ if m.bias is not None:
+ nn.init.zeros_(m.bias)
+
+ # init embeddings
+ nn.init.xavier_uniform_(self.patch_embedding.weight.flatten(1))
+ for m in self.text_embedding.modules():
+ if isinstance(m, nn.Linear):
+ nn.init.normal_(m.weight, std=.02)
+ for m in self.time_embedding.modules():
+ if isinstance(m, nn.Linear):
+ nn.init.normal_(m.weight, std=.02)
+
+ # init output layer
+ nn.init.zeros_(self.head.head.weight)
+
+
+
+
+class WanDiscreteVideoTransformer(ModelMixin, ConfigMixin):
+ r"""
+ Wrapper around :class:`WanModel` that makes it usable as a **discrete video diffusion backbone**.
+
+ The goals of this wrapper are:
+
+ - keep the inner :class:`WanModel` architecture and parameter names intact so that Wan-1.3B
+ weights can later be loaded directly into ``self.backbone``;
+ - expose a simpler interface that takes **discrete codebook indices** (from a 2D VQ-VAE on
+ pseudo-video) and returns **logits over the codebook** for each spatio‑temporal position.
+
+ Notes
+ -----
+ - This class does **not** try to be drop‑in compatible with Meissonic's 2D ``Transformer2DModel``.
+ It is a parallel, video‑oriented path that still follows the same *discrete diffusion* principle:
+ predict per‑token logits given masked tokens + text.
+ - Pseudo‑video is represented as a 4D integer tensor ``[B, F, H, W]`` of codebook indices.
+ How to get these tokens from the current 2D VQ-VAE (e.g. per‑frame encoding & stacking)
+ is left to the higher‑level training / pipeline code.
+ """
+
+ _supports_gradient_checkpointing = True
+
+ @register_to_config
+ def __init__(
+ self,
+ # discrete codebook settings
+ codebook_size: int,
+ vocab_size: int,
+ # video layout
+ num_frames: int,
+ height: int,
+ width: int,
+ # Wan backbone hyper‑parameters (mirrors WanModel.__init__)
+ model_type: str = 't2v',
+ patch_size: tuple = (1, 2, 2),
+ text_len: int = 512,
+ in_dim: int = 16,
+ dim: int = 2048,
+ ffn_dim: int = 8192,
+ freq_dim: int = 256,
+ text_dim: int = 4096,
+ out_dim: int = 16,
+ num_heads: int = 16,
+ num_layers: int = 32,
+ window_size: tuple = (-1, -1),
+ qk_norm: bool = True,
+ cross_attn_norm: bool = True,
+ eps: float = 1e-6,
+ ):
+ super().__init__()
+
+ # save a minimal set of attributes useful for downstream tooling
+ self.codebook_size = codebook_size
+ self.vocab_size = vocab_size
+ self.num_frames = num_frames
+ self.height = height
+ self.width = width
+
+ # 1) backbone: keep WanModel intact for future weight loading
+ self.backbone = WanModel(
+ model_type=model_type,
+ patch_size=patch_size,
+ text_len=text_len,
+ in_dim=in_dim,
+ dim=dim,
+ ffn_dim=ffn_dim,
+ freq_dim=freq_dim,
+ text_dim=text_dim,
+ out_dim=out_dim,
+ num_heads=num_heads,
+ num_layers=num_layers,
+ window_size=window_size,
+ qk_norm=qk_norm,
+ cross_attn_norm=cross_attn_norm,
+ eps=eps,
+ )
+
+ # 2) discrete token embedding -> continuous video volume
+ #
+ # Input: tokens [B, F, H, W] with values in [0, vocab_size) where:
+ # - [0, codebook_size-1] = actual Cosmos codes (direct mapping, no shift)
+ # - codebook_size = mask_token_id (reserved for masking)
+ # Output: list of length B with tensors [in_dim, F, H, W]
+ #
+ # We keep this outside the backbone so that loading official Wan 1.3B weights
+ # into self.backbone will still work without clashes.
+ # Note: vocab_size = codebook_size + 1 to accommodate mask_token_id = codebook_size
+ self.token_embedding = nn.Embedding(vocab_size, in_dim)
+
+ # 3) projection from continuous video output -> logits over codebook
+ #
+ # Backbone output: list of B tensors [out_dim, F, H', W']
+ # We map it with a 3D 1x1x1 conv to [vocab_size, F, H', W'].
+        # Note: vocab_size = codebook_size + 1; the extra index (== codebook_size) is reserved for mask_token_id
+ self.logits_head = nn.Conv3d(out_dim, vocab_size, kernel_size=1)
+
+ # Gradient checkpointing support
+ self.gradient_checkpointing = False
+
+ def _tokens_to_video(self, tokens: torch.LongTensor) -> list:
+ r"""
+ Convert discrete tokens ``[B, F, H, W]`` into a list of length ``B`` where each element
+ is a dense video tensor ``[in_dim, F, H, W]`` suitable for :class:`WanModel`.
+
+ Note:
+ This method now supports dynamic input dimensions. The num_frames, height, width
+ stored in config are used as defaults/for seq_len calculation, but inputs can
+ have different dimensions as long as they're valid.
+ """
+ assert tokens.dim() == 4, f"expected [B, F, H, W] tokens, got {tokens.shape}"
+ # Dynamic dimensions - no strict dimension checks, WanModel handles variable sizes
+
+ # [B, F, H, W, in_dim]
+ # Ensure output dtype matches token_embedding weight dtype
+ x = self.token_embedding(tokens)
+ # Ensure dtype matches model's expected dtype (usually bfloat16 for mixed precision)
+ token_embedding_dtype = self.token_embedding.weight.dtype
+ x = x.to(dtype=token_embedding_dtype)
+ # [B, in_dim, F, H, W]
+ x = x.permute(0, 4, 1, 2, 3).contiguous()
+
+ # WanModel expects a list of [C_in, F, H, W]
+ return [x_i for x_i in x]
+
+ def _text_to_list(self, encoder_hidden_states: torch.Tensor) -> list:
+ r"""
+ Convert batched text embeddings ``[B, L, C]`` into the list-of-tensors format
+ expected by :class:`WanModel`.
+ """
+ assert encoder_hidden_states.dim() == 3, (
+ f"expected encoder_hidden_states [B, L, C], got {encoder_hidden_states.shape}")
+ return [e for e in encoder_hidden_states]
+
+ def _set_gradient_checkpointing(self, enable=True, gradient_checkpointing_func=None):
+ """Set gradient checkpointing for the module."""
+ self.gradient_checkpointing = enable
+
+ def forward(
+ self,
+ tokens: torch.LongTensor,
+ timesteps: torch.LongTensor,
+ encoder_hidden_states: torch.FloatTensor,
+ y: Optional[list] = None,
+ context_lens: Optional[torch.LongTensor] = None,
+ ) -> torch.FloatTensor:
+ r"""
+ Forward pass of the **discrete video transformer**.
+
+ Args:
+ tokens (`torch.LongTensor` of shape `[B, F, H, W]`):
+ Discrete codebook indices (e.g. from a 2D VQ-VAE applied frame‑wise).
+ timesteps (`torch.LongTensor` of shape `[B]` or `[B, F * H * W]`):
+ Diffusion timestep(s), following the same semantics as Meissonic's scalar timesteps.
+            encoder_hidden_states (`torch.FloatTensor` of shape `[B, L, C_text]`):
+                Text embeddings (e.g. from a T5/UMT5 encoder); `C_text` must equal the backbone's
+                `text_dim`. Each sample corresponds to one video.
+            y (`Optional[list]`):
+                Optional conditional video list passed to the underlying :class:`WanModel`
+                for i2v / ti2v / s2v variants. For now this is surfaced as a raw passthrough
+                and can be left as ``None`` for pure text-to-video.
+            context_lens (`Optional[torch.LongTensor]` of shape `[B]`):
+                Optional number of valid (non-padding) text tokens per sample, passed through
+                to the underlying :class:`WanModel`.
+
+        Returns:
+            `torch.FloatTensor`:
+                Logits of shape `[B, vocab_size, F, H_out, W_out]`, where
+                `vocab_size = codebook_size + 1` (the extra channel scores the mask token).
+                :class:`WanModel` unpatchifies its output, so `(H_out, W_out)` match the
+                spatial size of the input token grid.
+ """
+ device = tokens.device
+ if DEBUG_TRANSFORMER:
+ print(f"[DEBUG-transformer] Input: tokens.shape={tokens.shape}, encoder_hidden_states.shape={encoder_hidden_states.shape}, timesteps.shape={timesteps.shape}")
+ x_list = self._tokens_to_video(tokens)
+ context_list = self._text_to_list(encoder_hidden_states)
+ if DEBUG_TRANSFORMER:
+ print(f"[DEBUG-transformer] After conversion: len(x_list)={len(x_list)}, len(context_list)={len(context_list)}")
+ if len(x_list) > 0:
+ print(f"[DEBUG-transformer] x_list[0].shape={x_list[0].shape}")
+ if len(context_list) > 0:
+ print(f"[DEBUG-transformer] context_list[0].shape={context_list[0].shape}")
+
+        # Calculate seq_len from actual input dimensions (supports dynamic sizes)
+        # tokens: [B, F, H, W] -> after patchification: seq_len = (F/p_t) * (H/p_h) * (W/p_w)
+        # The temporal patch size is included so that seq_len stays correct when p_t != 1.
+        _, f_in, h_in, w_in = tokens.shape
+        f_patch = f_in // self.backbone.patch_size[0]
+        h_patch = h_in // self.backbone.patch_size[1]
+        w_patch = w_in // self.backbone.patch_size[2]
+        seq_len = f_patch * h_patch * w_patch
+
+ # Prepare timesteps in the exact shape WanModel.forward expects.
+ # Its current implementation assumes `t` is either [B, seq_len] or will be
+ # expanded from 1D; the 1D branch is slightly buggy for non-singleton dims,
+ # so we always give it a [B, seq_len] tensor here.
+ if timesteps.dim() == 1:
+ # [B] -> [B, 1] -> [B, seq_len] (broadcast along sequence)
+ t_model = timesteps.to(device).unsqueeze(1).expand(-1, seq_len)
+ elif timesteps.dim() == 2:
+ assert timesteps.size(1) == seq_len, (
+ f"Expected timesteps second dim == seq_len ({seq_len}), "
+ f"but got {timesteps.size(1)}"
+ )
+ t_model = timesteps.to(device)
+ else:
+ raise ValueError(
+ f"Unsupported timesteps shape {timesteps.shape}; "
+ "expected [B] or [B, seq_len]"
+ )
+ if DEBUG_TRANSFORMER:
+ print(f"[DEBUG-transformer] t_model.shape={t_model.shape}")
+
+ # WanModel.forward expects:
+ # x: List[Tensor [C_in, F, H, W]]
+ # t: Tensor [B] or [B, seq_len]
+ # context: List[Tensor [L, C_text]]
+ # seq_len: int
+ # y: Optional[List[Tensor]]
+ # context_lens: Optional[Tensor [B]]
+ if self.training and self.gradient_checkpointing:
+ def create_custom_forward(module):
+ def custom_forward(*inputs):
+ # Unpack inputs: x_list, t, context_list, seq_len, y, context_lens
+ x_in, t_in, context_in, seq_len_in, y_in, context_lens_in = inputs
+ return module(x=x_in, t=t_in, context=context_in, seq_len=seq_len_in, y=y_in, context_lens=context_lens_in)
+ return custom_forward
+
+ # Use gradient checkpointing for the backbone
+ ckpt_kwargs = {"use_reentrant": False}
+ out_list = torch.utils.checkpoint.checkpoint(
+ create_custom_forward(self.backbone),
+ x_list,
+ t_model,
+ context_list,
+ seq_len,
+ y,
+ context_lens,
+ **ckpt_kwargs,
+ )
+ else:
+ out_list = self.backbone(
+ x=x_list,
+ t=t_model,
+ context=context_list,
+ seq_len=seq_len,
+ y=y,
+ context_lens=context_lens,
+ )
+ if DEBUG_TRANSFORMER:
+ print(f"[DEBUG-transformer] After backbone: len(out_list)={len(out_list)}")
+ if len(out_list) > 0:
+ print(f"[DEBUG-transformer] out_list[0].shape={out_list[0].shape}")
+
+ # out_list: length B, each [C_out, F, H_out, W_out]
+ vids = torch.stack(out_list, dim=0) # [B, C_out, F, H_out, W_out]
+ if DEBUG_TRANSFORMER:
+ print(f"[DEBUG-transformer] After stack: vids.shape={vids.shape}")
+ # Ensure vids dtype matches logits_head weight dtype
+ vids = vids.to(dtype=self.logits_head.weight.dtype)
+ logits = self.logits_head(vids) # [B, vocab_size, F, H_out, W_out] where vocab_size = codebook_size + 1
+ if DEBUG_TRANSFORMER:
+ print(f"[DEBUG-transformer] Final logits.shape={logits.shape}")
+ return logits
+
+# def _available_device():
+# return "cuda" if torch.cuda.is_available() else "cpu"
+
+
+# def test_wan_discrete_video_transformer_forward_and_shapes():
+# """
+# Basic smoke test:
+# - build a tiny WanDiscreteVideoTransformer
+# - run a forward pass with random pseudo-video tokens + random text
+# - check output shapes, parameter count and (if CUDA present) memory usage
+# """
+
+# device = _available_device()
+
+# # small config to keep the test lightweight
+# codebook_size = 128
+# vocab_size = codebook_size + 1 # reserve one for mask if needed later
+# num_frames = 2
+# height = 16
+# width = 16
+
+# model = WanDiscreteVideoTransformer(
+# codebook_size=codebook_size,
+# vocab_size=vocab_size,
+# num_frames=num_frames,
+# height=height,
+# width=width,
+# # shrink Wan backbone for the unit test
+# in_dim=32,
+# dim=64,
+# ffn_dim=128,
+# freq_dim=32,
+# text_dim=64,
+# out_dim=32,
+# num_heads=4,
+# num_layers=2,
+# ).to(device)
+# model.eval()
+
+# batch_size = 2
+
+# # pseudo-video tokens from 2D VQ-VAE on frames: [B, F, H, W]
+# tokens = torch.randint(
+# low=0,
+# high=codebook_size,
+# size=(batch_size, num_frames, height, width),
+# dtype=torch.long,
+# device=device,
+# )
+
+# # text: [B, L, C_text]
+# text_seq_len = 8
+# encoder_hidden_states = torch.randn(
+# batch_size, text_seq_len, model.backbone.text_dim, device=device
+# )
+
+# # timesteps: [B]
+# timesteps = torch.randint(
+# low=0, high=1000, size=(batch_size,), dtype=torch.long, device=device
+# )
+
+# # track memory if CUDA is available
+# if device == "cuda":
+# torch.cuda.reset_peak_memory_stats()
+# mem_before = torch.cuda.memory_allocated()
+# else:
+# mem_before = 0
+
+# with torch.no_grad():
+# logits = model(
+# tokens=tokens,
+# timesteps=timesteps,
+# encoder_hidden_states=encoder_hidden_states,
+# y=None,
+# )
+
+# if device == "cuda":
+# mem_after = torch.cuda.memory_allocated()
+# peak_mem = torch.cuda.max_memory_allocated()
+# else:
+# mem_after = mem_before
+# peak_mem = mem_before
+
+#     # logits: [B, vocab_size, F, H_out, W_out]
+#     assert logits.shape[0] == batch_size
+#     assert logits.shape[1] == vocab_size
+# assert logits.shape[2] == num_frames
+
+# # WanModel returns unpatchified videos, so spatial size matches the input grid.
+# h_out = height
+# w_out = width
+# assert logits.shape[3] == h_out
+# assert logits.shape[4] == w_out
+
+# # parameter count sanity check (just ensure it's > 0 and finite)
+# num_params = sum(p.numel() for p in model.parameters())
+# assert num_params > 0
+# assert math.isfinite(float(num_params))
+
+# # memory sanity check (on CUDA the forward pass should allocate > 0 bytes)
+# if device == "cuda":
+# assert peak_mem >= mem_after >= mem_before
+
+
+
+# import torch
+# from safetensors import safe_open
+# # from src.transformer_video import WanDiscreteVideoTransformer
+
+# ckpt_path = "/mnt/Meissonic/model/diffusion_pytorch_model.safetensors"
+
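+# # 1) Instantiate with the hyper-parameters of the Wan 2.1 checkpoint you want to match
+# #    (a commonly used configuration is written here; it must agree with the ckpt)
+# model = WanDiscreteVideoTransformer(
+#     codebook_size=128,  # chosen freely on the discrete side
+#     vocab_size=129,
+#     num_frames=2,
+#     height=16,
+#     width=16,
+#     # Wan backbone hyper-parameters must match the ckpt exactly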
+# model_type="t2v",
+# patch_size=(1, 2, 2),
+# in_dim=16,
+# dim=1536,
+# ffn_dim=8960,
+# freq_dim=256,
+# text_dim=4096,
+# out_dim=16,
+# num_heads=12,
+# num_layers=30,
+# window_size=(-1, -1),
+# qk_norm=True,
+# cross_attn_norm=True,
+# eps=1e-6,
+# )
+
+# # 2) Read the safetensors file
+# state_dict = {}
+# with safe_open(ckpt_path, framework="pt", device="cpu") as f:
+# for k in f.keys():
+# state_dict[k] = f.get_tensor(k)
+
+# # 3) Try loading into the backbone (token_embedding / logits_head are left untouched)
+# missing, unexpected = model.backbone.load_state_dict(state_dict, strict=False)
+
+# print("Missing keys:", missing[:50], "... total", len(missing))
+# print("Unexpected keys:", unexpected[:50], "... total", len(unexpected))
+# print("Backbone params (M):", sum(p.numel() for p in model.backbone.parameters()) / 1e6)
+# print("Params (M):", sum(p.numel() for p in model.parameters()) / 1e6)
+
+# # if __name__ == '__main__':
+# # # test_wan_discrete_video_transformer_forward_and_shapes()
+# # print('WanDiscreteVideoTransformer forward pass test: PASSED')
+
+
+
+
\ No newline at end of file
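A minimal sketch, not part of the patch, of the shape arithmetic that `WanDiscreteVideoTransformer.forward` relies on: the patch-token sequence length handed to the backbone, and the logits shape that comes back on the unpatchified grid. The helper names `patch_sequence_length` and `expected_logits_shape` are hypothetical, and the sketch assumes every grid dimension is divisible by the matching patch size.

```python
from typing import Tuple


def patch_sequence_length(
    frames: int,
    height: int,
    width: int,
    patch_size: Tuple[int, int, int] = (1, 2, 2),
) -> int:
    """Number of patch tokens produced from a [F, H, W] token grid."""
    p_t, p_h, p_w = patch_size
    if frames % p_t or height % p_h or width % p_w:
        raise ValueError("grid dimensions must be divisible by the patch size")
    return (frames // p_t) * (height // p_h) * (width // p_w)


def expected_logits_shape(
    batch: int, codebook_size: int, frames: int, height: int, width: int
) -> Tuple[int, int, int, int, int]:
    """Logits shape under the convention vocab_size = codebook_size + 1 (mask token)."""
    vocab_size = codebook_size + 1
    # The backbone unpatchifies its output, so the grid matches the input tokens.
    return (batch, vocab_size, frames, height, width)
```

With the small test configuration (2 frames, 16 x 16 tokens, `codebook_size=128`) this gives 128 patch tokens and logits with 129 channels on the unchanged 2 x 16 x 16 grid.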
diff --git a/Meissonic/tests/test.py b/Meissonic/tests/test.py
new file mode 100644
index 0000000000000000000000000000000000000000..a93eee1f28259d6ad78c0ee8ba9157dfabdb5d54
--- /dev/null
+++ b/Meissonic/tests/test.py
@@ -0,0 +1,111 @@
+import math
+
+import torch
+
+from src.transformer_video import WanDiscreteVideoTransformer
+
+
+def _available_device():
+    return "cuda" if torch.cuda.is_available() else "cpu"
+
+
+def test_wan_discrete_video_transformer_forward_and_shapes():
+    """
+    Basic smoke test:
+    - build a tiny WanDiscreteVideoTransformer
+    - run a forward pass with random pseudo-video tokens + random text
+    - check output shapes, parameter count and (if CUDA present) memory usage
+    """
+
+    device = _available_device()
+
+    # small config to keep the test lightweight
+    codebook_size = 128
+    vocab_size = codebook_size + 1  # last index (== codebook_size) is the mask token
+    num_frames = 2
+    height = 16
+    width = 16
+
+    model = WanDiscreteVideoTransformer(
+        codebook_size=codebook_size,
+        vocab_size=vocab_size,
+        num_frames=num_frames,
+        height=height,
+        width=width,
+        # shrink Wan backbone for the unit test
+        in_dim=32,
+        dim=64,
+        ffn_dim=128,
+        freq_dim=32,
+        text_dim=64,
+        out_dim=32,
+        num_heads=4,
+        num_layers=2,
+    ).to(device)
+    model.eval()
+
+    batch_size = 2
+
+    # pseudo-video tokens from 2D VQ-VAE on frames: [B, F, H, W]
+    tokens = torch.randint(
+        low=0,
+        high=codebook_size,
+        size=(batch_size, num_frames, height, width),
+        dtype=torch.long,
+        device=device,
+    )
+
+    # text: [B, L, C_text]
+    text_seq_len = 8
+    encoder_hidden_states = torch.randn(
+        batch_size, text_seq_len, model.backbone.text_dim, device=device
+    )
+
+    # timesteps: [B]
+    timesteps = torch.randint(
+        low=0, high=1000, size=(batch_size,), dtype=torch.long, device=device
+    )
+
+    # track memory if CUDA is available
+    if device == "cuda":
+        torch.cuda.reset_peak_memory_stats()
+        mem_before = torch.cuda.memory_allocated()
+    else:
+        mem_before = 0
+
+    with torch.no_grad():
+        logits = model(
+            tokens=tokens,
+            timesteps=timesteps,
+            encoder_hidden_states=encoder_hidden_states,
+            y=None,
+        )
+
+    if device == "cuda":
+        mem_after = torch.cuda.memory_allocated()
+        peak_mem = torch.cuda.max_memory_allocated()
+    else:
+        mem_after = mem_before
+        peak_mem = mem_before
+
+    # logits: [B, vocab_size, F, H_out, W_out] (logits_head also scores the mask token)
+    assert logits.shape[0] == batch_size
+    assert logits.shape[1] == vocab_size
+    assert logits.shape[2] == num_frames
+
+    # WanModel returns unpatchified videos, so spatial size matches the input grid.
+    h_out = height
+    w_out = width
+    assert logits.shape[3] == h_out
+    assert logits.shape[4] == w_out
+
+    # parameter count sanity check (just ensure it's > 0 and finite)
+    num_params = sum(p.numel() for p in model.parameters())
+    assert num_params > 0
+    assert math.isfinite(float(num_params))
+
+    # memory sanity check (on CUDA, memory must not shrink across the forward pass)
+    if device == "cuda":
+        assert peak_mem >= mem_after >= mem_before
+
+
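The smoke test samples tokens only from `[0, codebook_size)`, so the mask index is never fed to the model. A minimal sketch of how masked inputs could be built under the wrapper's convention `mask_token_id = codebook_size` follows. The function name `mask_tokens`, the flat token list, and the uniform random choice of positions are assumptions for illustration; the masking schedule used by the actual training code is not shown in this patch.

```python
import random
from typing import List


def mask_tokens(
    tokens: List[int], codebook_size: int, mask_ratio: float, seed: int = 0
) -> List[int]:
    """Replace round(mask_ratio * len(tokens)) random positions with the mask id."""
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    # Convention taken from the wrapper: the mask id is the last vocabulary index.
    mask_token_id = codebook_size
    num_masked = round(mask_ratio * len(tokens))
    rng = random.Random(seed)
    masked_positions = set(rng.sample(range(len(tokens)), num_masked))
    return [mask_token_id if i in masked_positions else t for i, t in enumerate(tokens)]
```

Every value in the result stays below `vocab_size = codebook_size + 1`, which is the range the token embedding in the wrapper is sized for.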
diff --git a/Meissonic/train/FEATURE_EXTRACTION_README.md b/Meissonic/train/FEATURE_EXTRACTION_README.md
new file mode 100644
index 0000000000000000000000000000000000000000..064843a220cf2442413f8d30ed72363e735a9f07
--- /dev/null
+++ b/Meissonic/train/FEATURE_EXTRACTION_README.md
@@ -0,0 +1,142 @@
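+# Feature Extraction and Precomputed-Feature Training Guide
+
+This document explains how to use pre-extracted features to speed up video training.
+
+## Overview
+
+To improve training efficiency, the following can be extracted ahead of time:
+1. **Video features (codes)**: videos encoded into discrete tokens with CosmosVideoTokenizer
+2. **Text features (embeddings)**: text encoded into embeddings with T5/UMT5
+
+Training then loads these pre-extracted features directly instead of re-encoding the data on every run.
+
+## Step 1: Extract Features
+
+Extract the features with the `extract_features.py` script: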
+
+```bash
+python train/extract_features.py \
+    --csv_path /path/to/OpenVid1M_reorganized.csv \
+    --video_root_dir /path/to/video_reorg \
+    --output_dir /path/to/extracted_features \
+    --text_encoder_architecture umt5-base \
+    --video_tokenizer_model_id Cosmos-1.0-Tokenizer-DV8x16x16 \
+    --num_frames 16 \
+    --video_height 480 \
+    --video_width 848 \
+    --batch_size 4 \
+    --num_workers 4
+```
+
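+### Parameters
+
+- `--csv_path`: path to the OpenVid1M CSV file
+- `--video_root_dir`: root directory of the video files (optional; auto-detected if omitted)
+- `--output_dir`: directory where the features are saved
+- `--text_encoder_architecture`: text encoder architecture (umt5-base/umt5-xxl/t5)
+- `--video_tokenizer_model_id`: model ID of the Cosmos video tokenizer
+- `--num_frames`, `--video_height`, `--video_width`: video parameters
+- `--batch_size`: batch size (adjust to the available GPU memory)
+- `--num_workers`: number of data-loader worker processes
+- `--max_samples`: maximum number of samples to process (for testing; all samples by default)
+- `--resume_from_index`: resume extraction from the given index (after an interruption)
+
+### Output Layout
+
+Extracted features are saved under `output_dir` in a three-level directory layout so that no single folder holds too many files: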
+
+```
+extracted_features/
+├── video_codes/                 # video codes (three-level directory layout)
+│   ├── 000/                     # level 1: index // 1000000
+│   │   ├── 000/                 # level 2: (index // 1000) % 1000
+│   │   │   ├── 000/             # level 3: index % 1000
+│   │   │   │   ├── 00000000.npy
+│   │   │   │   └── ...
+│   │   │   └── ...
+│   │   └── ...
+│   └── ...
+├── text_embeddings/             # text embeddings (three-level directory layout)
+│   ├── 000/
+│   │   ├── 000/
+│   │   │   ├── 000/
+│   │   │   │   ├── 00000000.npy
+│   │   │   │   └── ...
+│   │   │   └── ...
+│   │   └── ...
+│   └── ...
+└── metadata.json                # metadata (includes sample information)
+```
+
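+**Directory layout details**:
+- For index `index`, the file path is `level1/level2/level3/index.npy`
+  - `level1 = index // 1000000` (0-999)
+  - `level2 = (index // 1000) % 1000` (0-999)
+  - `level3 = index % 1000` (0-999)
+- Example: the file for index `1234567` is stored at `001/234/567/1234567.npy`
+
+This layout supports up to 1,000,000,000 samples with at most 1000 folders per level, which avoids having too many files in a single folder.
+
+## Step 2: Train with Pre-extracted Features
+
+Pass `--use_precomputed_features` and `--features_dir` to the training script: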
+
+```bash
+python train/train_mei_video.py \
+    --use_precomputed_features \
+    --features_dir /path/to/extracted_features \
+    --text_encoder_architecture umt5-base \
+    --video_tokenizer_model_id Cosmos-1.0-Tokenizer-DV8x16x16 \
+    --num_frames 16 \
+    --video_height 480 \
+    --video_width 848 \
+    --train_batch_size 8 \
+    --learning_rate 3e-4 \
+    --max_train_steps 10000 \
+    --output_dir ./output \
+    --mixed_precision bf16 \
+    --gradient_checkpointing \
+    --wan_pretrained_path /path/to/wan/weights \
+    --wan_backbone_lr_ratio 0.1 \
+    --freeze_wan_backbone  # optional: freeze the backbone
+```
+
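+### Key Parameters
+
+- `--use_precomputed_features`: enable the pre-extracted feature mode
+- `--features_dir`: path to the directory of pre-extracted features
+- All other training parameters stay the same
+
+## Advantages
+
+1. **Faster training**: videos and text are not re-encoded on every run
+2. **Higher GPU utilization**: less CPU-GPU data transfer
+3. **Memory efficiency**: feature files are much smaller than the raw videos
+4. **Reproducibility**: using the same features keeps training runs consistent
+
+## Notes
+
+1. **Feature consistency**: make sure the parameters used for extraction (num_frames, height, width, text_encoder) match those used for training
+2. **Storage**: features for 1M samples take roughly tens of GB
+3. **Resuming extraction**: if extraction is interrupted, resume it with `--resume_from_index`
+4. **Validation**: the text_encoder is still needed for validation during training, but it is not used to encode training data
+
+## Troubleshooting
+
+### Feature extraction fails
+
+- Check that the video file paths are correct
+- Check that there is enough GPU memory (try a smaller batch_size)
+- Look at the error messages in the log
+
+### Features not found during training
+
+- Confirm that the `--features_dir` path is correct
+- Confirm that the feature files exist (check the `video_codes/` and `text_embeddings/` directories)
+- Check that `metadata.json` exists
+
+### Dimension mismatch
+
+- Make sure extraction and training use the same video parameters (num_frames, height, width)
+- Make sure the same text_encoder architecture is used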
+
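The three-level layout described in the README can be sketched as a small path helper. This is an illustration only: the function name `feature_relpath` is hypothetical and is not taken from `extract_features.py`. The README's tree shows eight-digit zero-padded file names while its worked example shows an unpadded name, so the padding width is left as a parameter rather than assumed.

```python
def feature_relpath(index: int, name_width: int = 8) -> str:
    """Relative path 'level1/level2/level3/<index>.npy' for a sample index."""
    if not 0 <= index < 1_000_000_000:
        raise ValueError("index must be in [0, 1_000_000_000)")
    level1 = index // 1_000_000          # 0-999
    level2 = (index // 1_000) % 1_000    # 0-999
    level3 = index % 1_000               # 0-999
    # zfill pads with leading zeros; a width of 0 leaves the name unpadded.
    file_name = str(index).zfill(name_width) + ".npy"
    return f"{level1:03d}/{level2:03d}/{level3:03d}/{file_name}"
```

For index `1234567` the directory part is `001/234/567` in both cases; only the file name differs between the padded and unpadded variants.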
diff --git a/Meissonic/train/check_codebook_range.py b/Meissonic/train/check_codebook_range.py
new file mode 100644
index 0000000000000000000000000000000000000000..3f015b106b650c0af6fe21e1e3e2819e5e954c19
--- /dev/null
+++ b/Meissonic/train/check_codebook_range.py
@@ -0,0 +1,295 @@
+#!/usr/bin/env python3
+"""
+Check codebook range by iterating through videos and extracting codes.
+
+This script loads videos from the dataset, encodes them to get video codes,
+and tracks the min/max values to determine the codebook range.
+"""
+
+import argparse
+import os
+import sys
+import logging
+from tqdm import tqdm
+import torch
+import numpy as np
+
+sys.path.append(os.getcwd())
+
+from train.dataset_utils import OpenVid1MDataset, PrecomputedFeatureDataset
+from src.pipeline_video import CosmosVideoTokenizer
+from transformers import T5Tokenizer
+from torch.utils.data import DataLoader
+
+logging.basicConfig(
+ format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
+ datefmt="%m/%d/%Y %H:%M:%S",
+ level=logging.INFO,
+)
+logger = logging.getLogger(__name__)
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(description="Check codebook range from video dataset")
+
+ parser.add_argument(
+ "--csv_path",
+ type=str,
+ default=None,
+ help="Path to OpenVid1M CSV file (if using raw videos)",
+ )
+ parser.add_argument(
+ "--video_root_dir",
+ type=str,
+ default=None,
+ help="Root directory containing video files",
+ )
+ parser.add_argument(
+ "--features_dir",
+ type=str,
+ default=None,
+ help="Directory containing pre-extracted features (if using precomputed features)",
+ )
+ parser.add_argument(
+ "--video_tokenizer_model_id",
+ type=str,
+ default="Cosmos-1.0-Tokenizer-DV8x16x16",
+ help="HuggingFace model ID for Cosmos video tokenizer",
+ )
+ parser.add_argument(
+ "--num_frames",
+ type=int,
+ default=16,
+ help="Number of frames per video",
+ )
+ parser.add_argument(
+ "--video_height",
+ type=int,
+ default=480,
+ help="Video height",
+ )
+ parser.add_argument(
+ "--video_width",
+ type=int,
+ default=848,
+ help="Video width",
+ )
+ parser.add_argument(
+ "--text_encoder_architecture",
+ type=str,
+ default="umt5-base",
+ choices=["umt5-base", "umt5-xxl", "t5"],
+ help="Text encoder architecture",
+ )
+ parser.add_argument(
+ "--batch_size",
+ type=int,
+ default=1,
+ help="Batch size (use 1 for detailed per-sample tracking)",
+ )
+ parser.add_argument(
+ "--max_samples",
+ type=int,
+ default=None,
+ help="Maximum number of samples to check. If None, check all.",
+ )
+ parser.add_argument(
+ "--check_interval",
+ type=int,
+ default=10,
+ help="Print statistics every N samples",
+ )
+
+ return parser.parse_args()
+
+
+def main():
+ args = parse_args()
+
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ dtype = torch.float32
+
+ logger.info(f"Using device: {device}")
+
+ # Initialize video tokenizer (only needed if not using precomputed features)
+ video_tokenizer = None
+ use_precomputed = args.features_dir is not None
+
+ if not use_precomputed:
+ if args.csv_path is None:
+ raise ValueError("Either --csv_path or --features_dir must be provided")
+
+ logger.info(f"Loading video tokenizer: {args.video_tokenizer_model_id}")
+ video_tokenizer = CosmosVideoTokenizer(
+ model_id=args.video_tokenizer_model_id,
+ device=device,
+ dtype=dtype
+ )
+ video_tokenizer.requires_grad_(False)
+ video_tokenizer.eval()
+
+ # Get tokenizer info
+ logger.info(f"Video tokenizer codebook_size: {video_tokenizer.codebook_size}")
+ logger.info(f"Video tokenizer mask_token_id: {video_tokenizer.mask_token_id}")
+
+ # Create dataset
+ if use_precomputed:
+ logger.info(f"Using precomputed features from: {args.features_dir}")
+ dataset = PrecomputedFeatureDataset(
+ features_dir=args.features_dir,
+ num_samples=args.max_samples,
+ )
+ else:
+ # Auto-detect video_root_dir if not provided
+ if args.video_root_dir is None:
+ csv_dir = os.path.dirname(args.csv_path)
+ if os.path.exists(os.path.join(csv_dir, 'video_reorg')):
+ video_root_dir = os.path.join(csv_dir, 'video_reorg')
+ elif os.path.exists(os.path.join(os.path.dirname(csv_dir), 'video_reorg')):
+ video_root_dir = os.path.join(os.path.dirname(csv_dir), 'video_reorg')
+ else:
+ video_root_dir = csv_dir
+ logger.warning(f"Video directory not found, using CSV directory: {video_root_dir}")
+ else:
+ video_root_dir = args.video_root_dir
+
+ # Create tokenizer for dataset
+ if args.text_encoder_architecture == "umt5-base":
+ model_id = "google/umt5-base"
+ elif args.text_encoder_architecture == "umt5-xxl":
+ model_id = "google/umt5-xxl"
+ elif args.text_encoder_architecture == "t5":
+ model_id = "t5-base"
+ else:
+ raise ValueError(f"Unknown text encoder: {args.text_encoder_architecture}")
+
+ tokenizer = T5Tokenizer.from_pretrained(model_id)
+
+ dataset = OpenVid1MDataset(
+ csv_path=args.csv_path,
+ video_root_dir=video_root_dir,
+ tokenizer=tokenizer,
+ num_frames=args.num_frames,
+ height=args.video_height,
+ width=args.video_width,
+ text_encoder_architecture=args.text_encoder_architecture,
+ use_random_temporal_crop=False, # Fixed sampling for consistency
+ use_random_crop=False, # Center crop for consistency
+ )
+
+ if args.max_samples is not None:
+ dataset.data = dataset.data[:args.max_samples]
+ logger.info(f"Limited dataset to {len(dataset)} samples")
+
+ logger.info(f"Dataset size: {len(dataset)}")
+
+ # Create dataloader
+ dataloader = DataLoader(
+ dataset,
+ batch_size=args.batch_size,
+ shuffle=False,
+ num_workers=0, # Use 0 to avoid multiprocessing issues
+ pin_memory=False,
+ )
+
+ # Initialize statistics
+ global_min = None
+ global_max = None
+ total_samples = 0
+ failed_samples = 0
+
+ logger.info("Starting to check codebook range...")
+ logger.info("=" * 80)
+
+ with torch.no_grad():
+ for batch_idx, batch in enumerate(tqdm(dataloader, desc="Checking codes")):
+ try:
+ if use_precomputed:
+ # Use pre-extracted video codes
+ video_codes = batch["video_codes"] # [B, F', H', W']
+ if isinstance(video_codes, torch.Tensor):
+ video_codes = video_codes.long()
+ else:
+ video_codes = torch.from_numpy(video_codes).long()
+ else:
+ # Encode videos to get codes
+ videos = batch["video"].to(device, non_blocking=True) # [B, C, F, H, W]
+ video_codes = video_tokenizer.encode(videos) # [B, F', H', W']
+ video_codes = video_codes.cpu().long()
+
+ # Update statistics
+ batch_min = video_codes.min().item()
+ batch_max = video_codes.max().item()
+
+ if global_min is None:
+ global_min = batch_min
+ global_max = batch_max
+ else:
+ global_min = min(global_min, batch_min)
+ global_max = max(global_max, batch_max)
+
+ total_samples += video_codes.shape[0]
+
+ # Print statistics periodically
+ if (batch_idx + 1) % args.check_interval == 0 or batch_idx == 0:
+ print(f"\n[Sample {total_samples}]")
+ print(f" Current batch range: [{batch_min}, {batch_max}]")
+ print(f" Global range so far: [{global_min}, {global_max}]")
+ print(f" Codebook size (expected): {video_tokenizer.codebook_size if video_tokenizer else 'N/A'}")
+ if video_tokenizer:
+ expected_max = video_tokenizer.codebook_size - 1
+ print(f" Expected max (codebook_size - 1): {expected_max}")
+ if global_max > expected_max:
+ print(f" ⚠️ WARNING: Found code {global_max} > expected max {expected_max}!")
+ if global_min < 0:
+ print(f" ⚠️ WARNING: Found code {global_min} < 0!")
+
+ # Print unique values count for current batch
+ unique_values = torch.unique(video_codes).tolist()
+ print(f" Unique values in batch: {len(unique_values)}")
+ if len(unique_values) <= 20:
+ print(f" Values: {sorted(unique_values)}")
+ else:
+ print(f" Min unique: {min(unique_values)}, Max unique: {max(unique_values)}")
+ print("-" * 80)
+
+ except Exception as e:
+ failed_samples += args.batch_size
+ logger.error(f"Failed to process batch {batch_idx}: {e}")
+ continue
+
+ # Final summary
+ logger.info("=" * 80)
+ logger.info("FINAL STATISTICS:")
+ logger.info(f" Total samples processed: {total_samples}")
+ logger.info(f" Failed samples: {failed_samples}")
+ logger.info(f" Global min code: {global_min}")
+ logger.info(f" Global max code: {global_max}")
+ logger.info(f" Code range: [{global_min}, {global_max}]")
+
+    if video_tokenizer and global_min is not None:  # skip range checks when every batch failed
+ expected_max = video_tokenizer.codebook_size - 1
+ logger.info(f" Expected max (codebook_size - 1): {expected_max}")
+ logger.info(f" Codebook size: {video_tokenizer.codebook_size}")
+ logger.info(f" Mask token ID: {video_tokenizer.mask_token_id}")
+
+ if global_max > expected_max:
+ logger.warning(f" ⚠️ WARNING: Found code {global_max} > expected max {expected_max}!")
+ elif global_max == expected_max:
+ logger.info(f" ✓ Max code matches expected max")
+ else:
+ logger.info(f" Note: Max code {global_max} < expected max {expected_max} (some codes may not be used)")
+
+ if global_min < 0:
+ logger.warning(f" ⚠️ WARNING: Found code {global_min} < 0!")
+ elif global_min == 0:
+ logger.info(f" ✓ Min code is 0 (as expected)")
+ else:
+ logger.info(f" Note: Min code {global_min} > 0 (some codes may not be used)")
+
+ logger.info("=" * 80)
+
+
+if __name__ == "__main__":
+ main()
+
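The range bookkeeping in `check_codebook_range.py` can be sketched in isolation as below. This is a simplified illustration that works on plain Python integers instead of tensors; the class name `CodeRangeTracker` and its methods are hypothetical. It also treats the "no batch was processed" case explicitly, since comparing `None` with an integer raises a `TypeError` in Python 3.

```python
from typing import Iterable, List, Optional


class CodeRangeTracker:
    """Track the global min/max of observed codes and validate them."""

    def __init__(self, codebook_size: int) -> None:
        self.codebook_size = codebook_size
        self.global_min: Optional[int] = None
        self.global_max: Optional[int] = None

    def update(self, codes: Iterable[int]) -> None:
        codes = list(codes)
        if not codes:
            return  # nothing to record for an empty batch
        batch_min, batch_max = min(codes), max(codes)
        if self.global_min is None:
            self.global_min, self.global_max = batch_min, batch_max
        else:
            self.global_min = min(self.global_min, batch_min)
            self.global_max = max(self.global_max, batch_max)

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the range is valid."""
        if self.global_min is None:
            return ["no codes were processed"]
        issues = []
        expected_max = self.codebook_size - 1
        if self.global_min < 0:
            issues.append(f"found code {self.global_min} < 0")
        if self.global_max > expected_max:
            issues.append(f"found code {self.global_max} > expected max {expected_max}")
        return issues
```

Keeping the validation separate from the data loop makes the final summary straightforward to unit-test without loading a tokenizer or a dataset.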
diff --git a/Meissonic/train/dataset_utils.py b/Meissonic/train/dataset_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..350968ac56c1fdc16408b045aefb40cb7327509a
--- /dev/null
+++ b/Meissonic/train/dataset_utils.py
@@ -0,0 +1,1240 @@
+# Copyright 2024 The HuggingFace Team and The MeissonFlow Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import torch
+from torch.utils.data import Dataset
+from torchvision import transforms
+from PIL.ImageOps import exif_transpose
+from PIL import Image
+import io
+import pyarrow.parquet as pq
+import random
+import bisect
+import pyarrow.fs as fs
+import csv
+import numpy as np
+import logging
+
+logger = logging.getLogger(__name__)
+
+@torch.no_grad()
+def tokenize_prompt(tokenizer, prompt, text_encoder_architecture='open_clip'): # support open_clip, CLIP, T5/UMT5
+ if text_encoder_architecture == 'CLIP' or text_encoder_architecture == 'open_clip':
+ tokenizer_output = tokenizer(
+ prompt,
+ truncation=True,
+ padding="max_length",
+ max_length=77,
+ return_tensors="pt",
+ )
+ return tokenizer_output.input_ids, tokenizer_output.attention_mask
+ elif text_encoder_architecture in ['umt5-base', 'umt5-xxl', 't5']:
+ # T5/UMT5 tokenizer - return both input_ids and attention_mask
+ tokenizer_output = tokenizer(
+ prompt,
+ truncation=True,
+ padding="max_length",
+ max_length=512,
+ return_tensors="pt",
+ )
+ return tokenizer_output.input_ids, tokenizer_output.attention_mask
+ elif text_encoder_architecture == 'CLIP_T5_base': # we have two tokenizers, 1st for CLIP, 2nd for T5
+ input_ids = []
+ input_ids.append(tokenizer[0](
+ prompt,
+ truncation=True,
+ padding="max_length",
+ max_length=77,
+ return_tensors="pt",
+ ).input_ids)
+ input_ids.append(tokenizer[1](
+ prompt,
+ truncation=True,
+ padding="max_length",
+ max_length=512,
+ return_tensors="pt",
+ ).input_ids)
+ return input_ids
+ else:
+ raise ValueError(f"Unknown text_encoder_architecture: {text_encoder_architecture}")
+
+def encode_prompt(text_encoder, input_ids, text_encoder_architecture='open_clip', attention_mask=None): # support open_clip, CLIP, T5/UMT5
+ if text_encoder_architecture == 'CLIP' or text_encoder_architecture == 'open_clip':
+ outputs = text_encoder(input_ids=input_ids, return_dict=True, output_hidden_states=True)
+ encoder_hidden_states = outputs.hidden_states[-2]
+ cond_embeds = outputs[0]
+        return encoder_hidden_states, cond_embeds, None  # CLIP does not need context_lens
+ elif text_encoder_architecture in ['umt5-base', 'umt5-xxl', 't5']:
+ # T5/UMT5 encoder - only returns encoder_hidden_states, no pooled projection
+ outputs = text_encoder(input_ids=input_ids, return_dict=True)
+ encoder_hidden_states = outputs.last_hidden_state
+ # For T5, we don't have a pooled projection, so return None or a dummy tensor
+ # The video pipeline doesn't use cond_embeds, so we can return None
+ cond_embeds = None
+
+ # Calculate actual context lengths from input_ids
+ # Find the actual length by looking for padding tokens
+ batch_size = input_ids.shape[0]
+ context_lens_list = []
+ for i in range(batch_size):
+ input_ids_cpu = input_ids[i].cpu()
+ # Find first padding token or use full length
+ # For T5, padding token is typically 0, but let's be safe
+ seq_len = input_ids_cpu.shape[0]
+ actual_len = seq_len
+ for j in range(seq_len):
+ # Stop at pad (0) or EOS (1 in T5/UMT5 vocabularies), or the configured pad_token_id
+ if input_ids_cpu[j] in [0, 1] or input_ids_cpu[j] == text_encoder.config.pad_token_id:
+ actual_len = j
+ break
+ # Ensure minimum length of 1
+ actual_len = max(actual_len, 1)
+ context_lens_list.append(actual_len)
+ context_lens = torch.tensor(context_lens_list, dtype=torch.long, device=input_ids.device)
+
+ return encoder_hidden_states, cond_embeds, context_lens
+ elif text_encoder_architecture == 'CLIP_T5_base':
+ outputs_clip = text_encoder[0](input_ids=input_ids[0], return_dict=True, output_hidden_states=True)
+ outputs_t5 = text_encoder[1](input_ids=input_ids[1], decoder_input_ids=torch.zeros_like(input_ids[1]),
+ return_dict=True, output_hidden_states=True)
+ encoder_hidden_states = outputs_t5.encoder_hidden_states[-2]
+ cond_embeds = outputs_clip[0]
+
+ # For CLIP_T5, context_lens is None (not supported yet)
+ return encoder_hidden_states, cond_embeds, None
+ else:
+ raise ValueError(f"Unknown text_encoder_architecture: {text_encoder_architecture}")
+
+
+def process_image(image, size, Norm=False, hps_score=6.0):
+ image = exif_transpose(image)
+
+ if not image.mode == "RGB":
+ image = image.convert("RGB")
+
+ orig_height = image.height
+ orig_width = image.width
+
+ image = transforms.Resize(size, interpolation=transforms.InterpolationMode.BILINEAR)(image)
+
+ c_top, c_left, _, _ = transforms.RandomCrop.get_params(image, output_size=(size, size))
+ image = transforms.functional.crop(image, c_top, c_left, size, size)
+ image = transforms.ToTensor()(image)
+
+ if Norm:
+ image = transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5], inplace=True)(image)
+
+ micro_conds = torch.tensor(
+ [orig_width, orig_height, c_top, c_left, hps_score],
+ )
+
+ return {"image": image, "micro_conds": micro_conds}
+
+
+class MyParquetDataset(Dataset):
+ def __init__(self, root_dir, tokenizer=None, size=512,
+ text_encoder_architecture='CLIP', norm=False):
+ random.seed(23)
+
+ self.root_dir = root_dir
+ self.dataset_receipt = {'MSCOCO_part1': {'total_num': 6212, 'ratio':1}, 'MSCOCO_part2': {'total_num': 6212, 'ratio':1}}
+
+ self.tokenizer = tokenizer
+ self.size = size
+ self.text_encoder_architecture = text_encoder_architecture
+ self.norm = norm
+
+ self.hdfs = fs.HadoopFileSystem(host="", port=0000) # TODO: change to your own HDFS host and port
+ self._init_mixed_parquet_dir_list()
+
+ self.file_metadata = []
+ self.cumulative_sizes = [0]
+ total = 0
+ for path in self.parquet_files:
+ try:
+ with pq.ParquetFile(path, filesystem=self.hdfs) as pf:
+ num_rows = pf.metadata.num_rows
+ self.file_metadata.append({
+ 'path': path,
+ 'num_rows': num_rows,
+ 'global_offset': total
+ })
+ total += num_rows
+ self.cumulative_sizes.append(total)
+ except Exception as e:
+ print(f"Error processing {path}: {str(e)}")
+ continue
+
+ # init cache
+ self.current_file = None
+ self.cached_data = None
+ self.cached_file_index = -1
+
+ def _init_mixed_parquet_dir_list(self):
+ print('Loading parquet files, please be patient...')
+ self.parquet_files = []
+
+ for key, value in self.dataset_receipt.items():
+ # Generate a list of standard Parquet file paths, lazy load
+ hdfs_path = os.path.join(self.root_dir, key)
+
+ num = value['total_num']
+ sampled_list = random.sample(
+ [f"{hdfs_path}/train-{idx:05d}-of-{num:05d}.parquet" for idx in range(num)],
+ k=int(num * value['ratio'])
+ )
+ self.parquet_files += sampled_list
+
+ def __len__(self):
+ return self.cumulative_sizes[-1]
+
+ def _locate_file(self, global_idx):
+ # Use binary search to quickly locate files
+ file_index = bisect.bisect_right(self.cumulative_sizes, global_idx) - 1
+ if file_index < 0 or file_index >= len(self.file_metadata):
+ raise IndexError(f"Index {global_idx} out of range")
+
+ file_info = self.file_metadata[file_index]
+ local_idx = global_idx - file_info['global_offset']
+ return file_index, local_idx
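+ # Worked example (illustrative numbers): with cumulative_sizes = [0, 100, 250]
+ # (two files of 100 and 150 rows), global_idx 99 maps to (file 0, row 99),
+ # global_idx 100 maps to (file 1, row 0), and global_idx 250 raises IndexError.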
+
+ def _load_file(self, file_index):
+ """Load Parquet files into cache on demand"""
+ if self.cached_file_index != file_index:
+ file_info = self.file_metadata[file_index]
+ try:
+ table = pq.read_table(file_info['path'], filesystem=self.hdfs)
+ self.cached_data = table.to_pydict()
+ self.cached_file_index = file_index
+ except Exception as e:
+ print(f"Error loading {file_info['path']}: {str(e)}")
+ raise
+
+ def __getitem__(self, idx):
+ file_index, local_idx = self._locate_file(idx)
+ self._load_file(file_index)
+ sample = {k: v[local_idx] for k, v in self.cached_data.items()}
+
+ # cprint(sample.keys(), 'red')
+ generated_caption, image_path = sample['task2'], sample['image'] # only suitable for my data
+ instance_image = Image.open(io.BytesIO(image_path['bytes']))
+
+ # if instance_image.width < self.size or instance_image.height < self.size:
+ # raise ValueError(f"Image at {image_path} is too small")
+
+ rv = process_image(instance_image, self.size, self.norm)
+
+ if isinstance(self.tokenizer, list):
+ _tmp_ = tokenize_prompt(self.tokenizer, generated_caption, self.text_encoder_architecture)
+ rv["prompt_input_ids"] = [_tmp_[0][0], _tmp_[1][0]]
+ rv["prompt_attention_mask"] = [_tmp_[0][1], _tmp_[1][1]]
+ else:
+ input_ids, attention_mask = tokenize_prompt(self.tokenizer, generated_caption, self.text_encoder_architecture)
+ rv["prompt_input_ids"] = input_ids
+ rv["prompt_attention_mask"] = attention_mask
+
+ return rv
+
+class HuggingFaceDataset(Dataset):
+ def __init__(
+ self,
+ hf_dataset,
+ tokenizer,
+ image_key,
+ prompt_key,
+ prompt_prefix=None,
+ size=512,
+ text_encoder_architecture='CLIP',
+ ):
+ self.size = size
+ self.image_key = image_key
+ self.prompt_key = prompt_key
+ self.tokenizer = tokenizer
+ self.hf_dataset = hf_dataset
+ self.prompt_prefix = prompt_prefix
+ self.text_encoder_architecture = text_encoder_architecture
+
+ def __len__(self):
+ return len(self.hf_dataset)
+
+ def __getitem__(self, index):
+ item = self.hf_dataset[index]
+
+ rv = process_image(item[self.image_key], self.size)
+
+ prompt = item[self.prompt_key]
+
+ if self.prompt_prefix is not None:
+ prompt = self.prompt_prefix + prompt
+
+ if isinstance(self.tokenizer, list):
+ _tmp_ = tokenize_prompt(self.tokenizer, prompt, self.text_encoder_architecture)
+ rv["prompt_input_ids"] = [_tmp_[0][0],_tmp_[1][0]]
+ rv["prompt_attention_mask"] = [_tmp_[0][1],_tmp_[1][1]]
+ else:
+ input_ids, attention_mask = tokenize_prompt(self.tokenizer, prompt, self.text_encoder_architecture)
+ rv["prompt_input_ids"] = input_ids
+ rv["prompt_attention_mask"] = attention_mask
+
+ return rv
+
+
+def process_video(video_tensor, num_frames, height, width, use_random_crop=True):
+ """
+ Process video tensor for training.
+
+ Uses aspect-ratio preserving resize + crop to avoid distortion.
+
+ Args:
+ video_tensor: Video tensor of shape [C, F, H, W] or [F, H, W, C]
+ num_frames: Target number of frames
+ height: Target height
+ width: Target width
+ use_random_crop: If True, use random crop (for training). If False, use center crop (for validation/feature extraction)
+
+ Returns:
+ Processed video tensor of shape [C, F, H, W] in [0, 1] range
+ """
+ # Ensure video is in [C, F, H, W] format
+ if video_tensor.dim() == 4:
+ if video_tensor.shape[0] == 3 or video_tensor.shape[0] == 1:
+ # Already in [C, F, H, W] format
+ pass
+ elif video_tensor.shape[-1] == 3 or video_tensor.shape[-1] == 1:
+ # [F, H, W, C] -> [C, F, H, W]
+ video_tensor = video_tensor.permute(3, 0, 1, 2)
+ else:
+ raise ValueError(f"Unexpected video tensor shape: {video_tensor.shape}")
+
+ # Normalize to [0, 1] if needed
+ if video_tensor.max() > 1.0:
+ video_tensor = video_tensor / 255.0
+
+ C, F, H, W = video_tensor.shape
+
+ # Temporal resampling: ensure exactly num_frames frames
+ if F != num_frames:
+ if F < num_frames:
+ # If video is shorter, pad by repeating the last frame
+ num_pad = num_frames - F
+ last_frame = video_tensor[:, -1:, :, :] # [C, 1, H, W]
+ padding = last_frame.repeat(1, num_pad, 1, 1) # [C, num_pad, H, W]
+ video_tensor = torch.cat([video_tensor, padding], dim=1) # [C, num_frames, H, W]
+ F = num_frames
+ else:
+ # If video is longer, randomly select a continuous segment of num_frames
+ max_start = F - num_frames
+ start_idx = random.randint(0, max_start)
+ indices = torch.arange(start_idx, start_idx + num_frames)
+ video_tensor = video_tensor[:, indices, :, :]
+ F = num_frames # Update F after temporal resampling
+
+ # Spatial resizing: aspect-ratio preserving resize + crop
+ if H != height or W != width:
+ # Step 1: Aspect-ratio preserving resize
+ # Calculate scale factors for both dimensions
+ scale_h = height / H
+ scale_w = width / W
+
+ # Use the larger scale to ensure both dimensions are at least as large as target
+ # This way, after resize, we can crop to exact target size
+ scale = max(scale_h, scale_w)
+
+ # Calculate new dimensions maintaining aspect ratio
+ new_H = int(H * scale)
+ new_W = int(W * scale)
+
+ # Ensure we have at least the target size (handle rounding)
+ if new_H < height:
+ new_H = height
+ if new_W < width:
+ new_W = width
+
+ # Resize maintaining aspect ratio
+ # Process each frame: [C, F, H, W] -> reshape to [C*F, 1, H, W] for interpolation
+ video_tensor = torch.nn.functional.interpolate(
+ video_tensor.reshape(C * F, 1, H, W),
+ size=(new_H, new_W),
+ mode='bilinear',
+ align_corners=False
+ ).reshape(C, F, new_H, new_W)
+
+ # Step 2: Crop to target size (height, width)
+ # Calculate crop coordinates
+ if use_random_crop:
+ # Random crop for training (data augmentation)
+ max_h = new_H - height
+ max_w = new_W - width
+ if max_h < 0 or max_w < 0:
+ # If resized image is smaller than target, pad instead
+ pad_h = max(0, height - new_H)
+ pad_w = max(0, width - new_W)
+ video_tensor = torch.nn.functional.pad(
+ video_tensor,
+ (pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2),
+ mode='constant',
+ value=0
+ )
+ # If still not the exact size, resize to the target
+ if video_tensor.shape[2] != height or video_tensor.shape[3] != width:
+ video_tensor = torch.nn.functional.interpolate(
+ video_tensor.reshape(C * F, 1, video_tensor.shape[2], video_tensor.shape[3]),
+ size=(height, width),
+ mode='bilinear',
+ align_corners=False
+ ).reshape(C, F, height, width)
+ else:
+ crop_h = random.randint(0, max_h)
+ crop_w = random.randint(0, max_w)
+ video_tensor = video_tensor[:, :, crop_h:crop_h + height, crop_w:crop_w + width]
+ else:
+ # Center crop for validation/feature extraction (deterministic)
+ crop_h = (new_H - height) // 2
+ crop_w = (new_W - width) // 2
+ if crop_h < 0 or crop_w < 0:
+ # If resized image is smaller than target, pad instead
+ pad_h = max(0, height - new_H)
+ pad_w = max(0, width - new_W)
+ video_tensor = torch.nn.functional.pad(
+ video_tensor,
+ (pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2),
+ mode='constant',
+ value=0
+ )
+ # If still not the exact size, resize to the target
+ if video_tensor.shape[2] != height or video_tensor.shape[3] != width:
+ video_tensor = torch.nn.functional.interpolate(
+ video_tensor.reshape(C * F, 1, video_tensor.shape[2], video_tensor.shape[3]),
+ size=(height, width),
+ mode='bilinear',
+ align_corners=False
+ ).reshape(C, F, height, width)
+ else:
+ video_tensor = video_tensor[:, :, crop_h:crop_h + height, crop_w:crop_w + width]
+
+ # Final verification: ensure output has exactly the expected shape
+ C, F, H, W = video_tensor.shape
+ assert F == num_frames, f"Frame count mismatch: expected {num_frames}, got {F}"
+ assert H == height, f"Height mismatch: expected {height}, got {H}"
+ assert W == width, f"Width mismatch: expected {width}, got {W}"
+
+ return video_tensor
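+# Worked example (illustrative numbers, assuming num_frames=16): a [3, 16, 960, 1920] clip
+# with a 480x848 target gives scale = max(480/960, 848/1920) = 0.5, so frames are resized
+# to 480x960. A center crop then uses crop_h = 0 and crop_w = (960 - 848) // 2 = 56; a
+# random crop draws crop_w from [0, 112]. The result has shape [3, 16, 480, 848].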
+
+
+class VideoDataset(Dataset):
+ """
+ Dataset for video training, compatible with HuggingFace datasets format.
+ Supports OpenVid1M and similar video-text datasets.
+ """
+ def __init__(
+ self,
+ hf_dataset,
+ tokenizer,
+ video_key="video",
+ prompt_key="caption",
+ prompt_prefix=None,
+ num_frames=16,
+ height=480,
+ width=848,
+ text_encoder_architecture='umt5-base',
+ use_random_crop=True, # Random crop for training, center crop for validation
+ ):
+ self.hf_dataset = hf_dataset
+ self.tokenizer = tokenizer
+ self.video_key = video_key
+ self.prompt_key = prompt_key
+ self.prompt_prefix = prompt_prefix
+ self.num_frames = num_frames
+ self.height = height
+ self.width = width
+ self.text_encoder_architecture = text_encoder_architecture
+ self.use_random_crop = use_random_crop
+
+ def __len__(self):
+ return len(self.hf_dataset)
+
+ def __getitem__(self, index):
+ item = self.hf_dataset[index]
+
+ # Load video
+ video = item[self.video_key]
+
+ # Convert to tensor if needed (handle different formats)
+ if isinstance(video, list):
+ # List of PIL Images or tensors
+ frames = []
+ for frame in video:
+ if isinstance(frame, Image.Image):
+ frame = transforms.ToTensor()(frame)
+ frames.append(frame)
+ video_tensor = torch.stack(frames, dim=1) # [C, F, H, W]
+ elif isinstance(video, torch.Tensor):
+ video_tensor = video
+ else:
+ raise ValueError(f"Unsupported video type: {type(video)}")
+
+ # Process video
+ video_tensor = process_video(video_tensor, self.num_frames, self.height, self.width, use_random_crop=self.use_random_crop)
+
+ # Ensure video tensor has exactly the expected shape
+ C, F, H, W = video_tensor.shape
+ if F != self.num_frames or H != self.height or W != self.width:
+ # If shape doesn't match, create a properly sized tensor
+ video_tensor = torch.nn.functional.interpolate(
+ video_tensor.reshape(C * F, 1, H, W),
+ size=(self.height, self.width),
+ mode='bilinear',
+ align_corners=False
+ ).reshape(C, F, self.height, self.width)
+ # Ensure exactly num_frames
+ if F < self.num_frames:
+ # Pad by repeating last frame
+ num_pad = self.num_frames - F
+ last_frame = video_tensor[:, -1:, :, :]
+ padding = last_frame.repeat(1, num_pad, 1, 1)
+ video_tensor = torch.cat([video_tensor, padding], dim=1)
+ elif F > self.num_frames:
+ # Crop to num_frames
+ video_tensor = video_tensor[:, :self.num_frames, :, :]
+
+ # Clone to ensure storage is resizable (required for DataLoader collate)
+ video_tensor = video_tensor.contiguous().clone()
+
+ # Process prompt
+ prompt = item[self.prompt_key]
+ if self.prompt_prefix is not None:
+ prompt = self.prompt_prefix + prompt
+
+ prompt_input_ids, prompt_attention_mask = tokenize_prompt(self.tokenizer, prompt, self.text_encoder_architecture)
+ # Clone to ensure storage is resizable
+ prompt_input_ids = prompt_input_ids.clone()
+ prompt_attention_mask = prompt_attention_mask.clone()
+
+ rv = {
+ "video": video_tensor, # [C, num_frames, height, width], guaranteed shape
+ "prompt_input_ids": prompt_input_ids,
+ "prompt_attention_mask": prompt_attention_mask
+ }
+
+ return rv
+
+
+class OpenVid1MDataset(Dataset):
+ """
+ Dataset for OpenVid1M video-text pairs from CSV file.
+
+ CSV format:
+ video,caption,aesthetic score,motion score,temporal consistency score,camera motion,frame,fps,seconds,new_id
+
+ Returns:
+ dict with keys:
+ - "video": torch.Tensor of shape [C, F, H, W] in [0, 1] range
+ - "prompt_input_ids": torch.Tensor of tokenized prompt
+ - "prompt_attention_mask": torch.Tensor attention mask for the tokenized prompt
+ """
+ def __init__(
+ self,
+ csv_path,
+ video_root_dir,
+ tokenizer,
+ num_frames=16,
+ height=480,
+ width=848,
+ text_encoder_architecture='umt5-base',
+ prompt_prefix=None,
+ use_random_temporal_crop=True, # If False, always sample from the beginning
+ use_random_crop=True, # Random crop for training, center crop for validation/feature extraction
+ ):
+ """
+ Args:
+ csv_path: Path to the CSV file containing video metadata
+ video_root_dir: Root directory where video files are stored
+ tokenizer: Text tokenizer
+ num_frames: Target number of frames to extract
+ height: Target height
+ width: Target width
+ text_encoder_architecture: Architecture of text encoder
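+ prompt_prefix: Optional prefix to add to prompts
+ use_random_temporal_crop: If True, sample a random continuous segment of frames; if False, always start from the first frame
+ use_random_crop: If True, use a random spatial crop (training); if False, use a center crop (validation/feature extraction)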
+ """
+ self.csv_path = csv_path
+ self.video_root_dir = video_root_dir
+ self.tokenizer = tokenizer
+ self.num_frames = num_frames
+ self.height = height
+ self.width = width
+ self.text_encoder_architecture = text_encoder_architecture
+ self.prompt_prefix = prompt_prefix
+ self.use_random_temporal_crop = use_random_temporal_crop
+ self.use_random_crop = use_random_crop
+
+ # Load CSV data
+ self.data = []
+ with open(csv_path, 'r', encoding='utf-8') as f:
+ reader = csv.DictReader(f)
+ for row in reader:
+ self.data.append(row)
+
+ logger.info(f"Loaded {len(self.data)} video entries from {csv_path}")
+
+ # Try to import video loading library
+ self.video_loader = None
+ try:
+ import decord
+ decord.bridge.set_bridge('torch')
+ self.video_loader = 'decord'
+ logger.info("Using decord for video loading")
+ except ImportError:
+ try:
+ import av
+ self.video_loader = 'av'
+ logger.info("Using PyAV for video loading")
+ except ImportError:
+ try:
+ import cv2
+ self.video_loader = 'cv2'
+ logger.info("Using OpenCV for video loading")
+ except ImportError:
+ raise ImportError(
+ "No video loading library found. Please install one of: "
+ "decord (pip install decord), PyAV (pip install av), or opencv-python (pip install opencv-python)"
+ )
+
+ def __len__(self):
+ return len(self.data)
+
+ def _load_video_decord(self, video_path):
+ """Load video using decord"""
+ import decord
+ vr = decord.VideoReader(video_path, ctx=decord.cpu(0))
+ total_frames = len(vr)
+
+ # Sample frames: random temporal crop (continuous segment) for better temporal coherence
+ if total_frames <= self.num_frames:
+ indices = list(range(total_frames))
+ else:
+ if self.use_random_temporal_crop:
+ # Randomly select a continuous segment of num_frames
+ max_start = total_frames - self.num_frames
+ start_idx = random.randint(0, max_start)
+ else:
+ # Fixed sampling: always start from the beginning
+ start_idx = 0
+ indices = list(range(start_idx, start_idx + self.num_frames))
+
+ frames = vr.get_batch(indices) # [F, H, W, C] in uint8
+ # If using torch bridge, frames is already a torch Tensor
+ if isinstance(frames, torch.Tensor):
+ frames = frames.float() # [F, H, W, C]
+ else:
+ # Use torch.tensor() instead of torch.from_numpy() to ensure a complete copy
+ # This avoids "Trying to resize storage that is not resizable" errors in DataLoader collate
+ frames = torch.tensor(frames, dtype=torch.float32) # [F, H, W, C], fully copied
+ frames = frames.permute(3, 0, 1, 2) # [C, F, H, W]
+ frames = frames / 255.0 # Normalize to [0, 1]
+
+ return frames
+
+ def _load_video_av(self, video_path):
+ """Load video using PyAV"""
+ import av
+ container = av.open(video_path)
+ frames = []
+
+ # Get video stream
+ video_stream = container.streams.video[0]
+ total_frames = video_stream.frames if video_stream.frames > 0 else None
+
+ # Sample frames: random temporal crop (continuous segment) for better temporal coherence
+ if total_frames is None:
+ # If we can't get frame count, decode all frames and sample
+ frame_list = []
+ for frame in container.decode(video_stream):
+ frame_list.append(frame)
+ total_frames = len(frame_list)
+ if total_frames <= self.num_frames:
+ frame_indices = list(range(total_frames))
+ else:
+ if self.use_random_temporal_crop:
+ # Randomly select a continuous segment of num_frames
+ max_start = total_frames - self.num_frames
+ start_idx = random.randint(0, max_start)
+ else:
+ # Fixed sampling: always start from the beginning
+ start_idx = 0
+ frame_indices = list(range(start_idx, start_idx + self.num_frames))
+ frames = [transforms.ToTensor()(frame_list[i].to_image()) for i in frame_indices]
+ else:
+ if total_frames <= self.num_frames:
+ frame_indices = list(range(total_frames))
+ else:
+ if self.use_random_temporal_crop:
+ # Randomly select a continuous segment of num_frames
+ max_start = total_frames - self.num_frames
+ start_idx = random.randint(0, max_start)
+ else:
+ # Fixed sampling: always start from the beginning
+ start_idx = 0
+ frame_indices = list(range(start_idx, start_idx + self.num_frames))
+
+ frame_idx = 0
+ for frame in container.decode(video_stream):
+ if frame_idx in frame_indices:
+ img = frame.to_image() # PIL Image
+ img_tensor = transforms.ToTensor()(img) # [C, H, W]
+ frames.append(img_tensor)
+ if len(frames) >= self.num_frames:
+ break
+ frame_idx += 1
+
+ container.close()
+
+ if len(frames) == 0:
+ raise ValueError(f"No frames extracted from {video_path}")
+
+ # Stack frames: [C, F, H, W]
+ video_tensor = torch.stack(frames, dim=1)
+
+ # Pad if needed
+ if video_tensor.shape[1] < self.num_frames:
+ padding = torch.zeros(
+ video_tensor.shape[0],
+ self.num_frames - video_tensor.shape[1],
+ video_tensor.shape[2],
+ video_tensor.shape[3]
+ )
+ video_tensor = torch.cat([video_tensor, padding], dim=1)
+
+ return video_tensor
+
+ def _load_video_cv2(self, video_path):
+ """Load video using OpenCV"""
+ import cv2
+ cap = cv2.VideoCapture(video_path)
+ frames = []
+
+ total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+
+ # Sample frames: random temporal crop (continuous segment) for better temporal coherence
+ if total_frames <= self.num_frames:
+ frame_indices = list(range(total_frames))
+ else:
+ if self.use_random_temporal_crop:
+ # Randomly select a continuous segment of num_frames
+ max_start = total_frames - self.num_frames
+ start_idx = random.randint(0, max_start)
+ else:
+ # Fixed sampling: always start from the beginning
+ start_idx = 0
+ frame_indices = list(range(start_idx, start_idx + self.num_frames))
+
+ frame_idx = 0
+ while True:
+ ret, frame = cap.read()
+ if not ret:
+ break
+ if frame_idx in frame_indices:
+ # Convert BGR to RGB
+ frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
+ # Convert to tensor [C, H, W] and normalize to [0, 1]
+ # Use torch.tensor() instead of torch.from_numpy() to ensure a complete copy
+ # This avoids "Trying to resize storage that is not resizable" errors in DataLoader collate
+ frame_tensor = torch.tensor(frame_rgb, dtype=torch.float32).permute(2, 0, 1) / 255.0
+ frames.append(frame_tensor)
+ if len(frames) >= self.num_frames:
+ break
+ frame_idx += 1
+
+ cap.release()
+
+ if len(frames) == 0:
+ raise ValueError(f"No frames extracted from {video_path}")
+
+ # Stack frames: [C, F, H, W]
+ video_tensor = torch.stack(frames, dim=1)
+
+ # Pad if needed
+ if video_tensor.shape[1] < self.num_frames:
+ padding = torch.zeros(
+ video_tensor.shape[0],
+ self.num_frames - video_tensor.shape[1],
+ video_tensor.shape[2],
+ video_tensor.shape[3]
+ )
+ video_tensor = torch.cat([video_tensor, padding], dim=1)
+
+ return video_tensor
+
+ def _load_video(self, video_path):
+ """Load video from path using the available video loader"""
+ full_path = os.path.join(self.video_root_dir, video_path)
+
+ if not os.path.exists(full_path):
+ raise FileNotFoundError(f"Video file not found: {full_path}")
+
+ if self.video_loader == 'decord':
+ return self._load_video_decord(full_path)
+ elif self.video_loader == 'av':
+ return self._load_video_av(full_path)
+ elif self.video_loader == 'cv2':
+ return self._load_video_cv2(full_path)
+ else:
+ raise ValueError(f"Unknown video loader: {self.video_loader}")
+
+ def __getitem__(self, index):
+ row = self.data[index]
+
+ # Load video
+ video_path = row['video']
+ try:
+ video_tensor = self._load_video(video_path)
+ except Exception as e:
+ # If video loading fails, return a zero tensor and log error
+ logger.warning(f"Failed to load video {video_path}: {e}")
+ video_tensor = torch.zeros(3, self.num_frames, self.height, self.width)
+
+ # Process video: aspect-ratio preserving resize + crop to target dimensions
+ video_tensor = process_video(video_tensor, self.num_frames, self.height, self.width, use_random_crop=self.use_random_crop)
+
+ # Ensure video tensor has exactly the expected shape
+ C, F, H, W = video_tensor.shape
+ if F != self.num_frames or H != self.height or W != self.width:
+ # If shape doesn't match, create a properly sized tensor
+ video_tensor = torch.nn.functional.interpolate(
+ video_tensor.reshape(C * F, 1, H, W),
+ size=(self.height, self.width),
+ mode='bilinear',
+ align_corners=False
+ ).reshape(C, F, self.height, self.width)
+ # Ensure exactly num_frames
+ if F < self.num_frames:
+ # Pad by repeating last frame
+ num_pad = self.num_frames - F
+ last_frame = video_tensor[:, -1:, :, :]
+ padding = last_frame.repeat(1, num_pad, 1, 1)
+ video_tensor = torch.cat([video_tensor, padding], dim=1)
+ elif F > self.num_frames:
+ # Crop to num_frames
+ video_tensor = video_tensor[:, :self.num_frames, :, :]
+
+ # Clone to ensure storage is resizable (required for DataLoader collate)
+ video_tensor = video_tensor.contiguous().clone()
+
+ # Process prompt
+ prompt = row['caption']
+ if self.prompt_prefix is not None:
+ prompt = self.prompt_prefix + prompt
+
+ prompt_input_ids, prompt_attention_mask = tokenize_prompt(self.tokenizer, prompt, self.text_encoder_architecture)
+ # Clone to ensure storage is resizable
+ prompt_input_ids = prompt_input_ids.clone()
+ prompt_attention_mask = prompt_attention_mask.clone()
+
+ return {
+ "video": video_tensor, # [C, num_frames, height, width], guaranteed shape
+ "prompt_input_ids": prompt_input_ids,
+ "prompt_attention_mask": prompt_attention_mask
+ }
+
+
+class TinyOpenVid1MDataset(OpenVid1MDataset):
+ """
+ A tiny subset of OpenVid1MDataset for overfitting experiments.
+ Keeps at most max_samples entries, chosen by shuffling the full dataset with a fixed seed.
+ """
+ def __init__(
+ self,
+ csv_path,
+ video_root_dir=None,
+ tokenizer=None,
+ num_frames=16,
+ height=480,
+ width=848,
+ text_encoder_architecture='umt5-base',
+ prompt_prefix=None,
+ max_samples=256, # Keep at most N samples (selected by a seeded shuffle)
+ seed=42, # Fixed seed for reproducibility
+ ):
+ """
+ Args:
+ max_samples: Maximum number of samples to use (default: 256)
+ seed: Random seed for reproducibility (default: 42)
+ """
+ # Initialize parent class
+ super().__init__(
+ csv_path=csv_path,
+ video_root_dir=video_root_dir,
+ tokenizer=tokenizer,
+ num_frames=num_frames,
+ height=height,
+ width=width,
+ text_encoder_architecture=text_encoder_architecture,
+ prompt_prefix=prompt_prefix,
+ )
+
+ # Limit to max_samples entries, selected by a seeded shuffle
+ original_len = len(self.data)
+ if original_len > max_samples:
+ # Use a local RNG with a fixed seed so the global random state is left untouched
+ rng = random.Random(seed)
+ # Shuffle with fixed seed, then take first max_samples
+ indices = list(range(original_len))
+ rng.shuffle(indices)
+ self.data = [self.data[i] for i in indices[:max_samples]]
+ logger.info(f"Limited dataset to {max_samples} samples (from {original_len} total) for overfitting experiment")
+ else:
+ logger.info(f"Using all {len(self.data)} samples (less than max_samples={max_samples})")
+
+
+def get_hierarchical_path(base_dir, index):
+ """
+ Get hierarchical path for loading features from 3-level directory structure.
+
+ Structure: base_dir/level1/level2/level3/filename.npy
+ - level1: index // 1000000 (0-999)
+ - level2: (index // 1000) % 1000 (0-999)
+ - level3: index % 1000 (0-999)
+
+ Args:
+ base_dir: Base directory for features
+ index: Sample index
+
+ Returns:
+ Full path to the file
+ """
+ level1 = index // 1000000
+ level2 = (index // 1000) % 1000
+ level3 = index % 1000
+
+ file_path = os.path.join(
+ base_dir,
+ f"{level1:03d}",
+ f"{level2:03d}",
+ f"{level3:03d}",
+ f"{index:08d}.npy"
+ )
+
+ return file_path
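+# Worked example (illustrative path, POSIX separators):
+# get_hierarchical_path("features/video_codes", 1234567)
+# returns "features/video_codes/001/234/567/01234567.npy"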
+
+
+class PrecomputedVideoOnlyDataset(Dataset):
+ """
+ Dataset for loading pre-extracted video codes only, with text processed at runtime.
+
+ This dataset loads video codes that were pre-extracted by extract_features.py,
+ but requires a tokenizer for text processing during training.
+
+ Features are stored in a 3-level hierarchical directory structure:
+ - video_codes/level1/level2/level3/index.npy
+ - text data comes from metadata (captions)
+ """
+
+ def __init__(
+ self,
+ features_dir,
+ tokenizer=None,
+ text_encoder_architecture='umt5-xxl',
+ prompt_prefix=None,
+ num_samples=None,
+ start_index=0,
+ ):
+ """
+ Args:
+ features_dir: Directory containing extracted features (should have video_codes/ subdir and metadata.json)
+ tokenizer: Text tokenizer for processing prompts at runtime
+ text_encoder_architecture: Text encoder architecture
+ prompt_prefix: Optional prefix to add to prompts
+ num_samples: Number of samples to use. If None, use all available samples.
+ start_index: Starting index for samples (for resuming or subset selection)
+ """
+ self.features_dir = features_dir
+ self.video_codes_dir = os.path.join(features_dir, "video_codes")
+ self.tokenizer = tokenizer
+ self.text_encoder_architecture = text_encoder_architecture
+ self.prompt_prefix = prompt_prefix
+ self.metadata_file = os.path.join(features_dir, "metadata.json")
+
+ # Load metadata
+ if os.path.exists(self.metadata_file):
+ import json
+ with open(self.metadata_file, 'r') as f:
+ self.metadata = json.load(f)
+ logger.info(f"Loaded metadata from {self.metadata_file}")
+ logger.info(f" Total samples in metadata: {self.metadata.get('num_samples', 'unknown')}")
+
+ # Get available indices from metadata
+ if 'samples' in self.metadata and len(self.metadata['samples']) > 0:
+ self.samples_data = {s['index']: s for s in self.metadata['samples']}
+ available_indices = sorted(self.samples_data.keys())
+ else:
+ # Fallback: infer from directory structure (no per-sample captions are available in this case)
+ self.samples_data = {}
+ available_indices = self._scan_hierarchical_directory(self.video_codes_dir)
+ else:
+ raise FileNotFoundError(f"Metadata file not found: {self.metadata_file}. Required for PrecomputedVideoOnlyDataset.")
+
+ # Filter by start_index and num_samples
+ available_indices = [idx for idx in available_indices if idx >= start_index]
+ if num_samples is not None:
+ available_indices = available_indices[:num_samples]
+
+ self.indices = available_indices
+ logger.info(f"PrecomputedVideoOnlyDataset: {len(self.indices)} samples available")
+ if len(self.indices) > 0:
+ logger.info(f" Index range: {min(self.indices)} to {max(self.indices)}")
+
+ def _scan_hierarchical_directory(self, base_dir):
+ """
+ Scan hierarchical directory structure to find all available indices.
+
+ Args:
+ base_dir: Base directory to scan
+
+ Returns:
+ List of available indices
+ """
+ available_indices = []
+
+ if not os.path.exists(base_dir):
+ raise FileNotFoundError(f"Directory not found: {base_dir}")
+
+ # Scan level1 directories (000-999)
+ for level1 in range(1000):
+ level1_dir = os.path.join(base_dir, f"{level1:03d}")
+ if not os.path.exists(level1_dir):
+ continue
+
+ # Scan level2 directories (000-999)
+ for level2 in range(1000):
+ level2_dir = os.path.join(level1_dir, f"{level2:03d}")
+ if not os.path.exists(level2_dir):
+ continue
+
+ # Scan level3 directories (000-999)
+ for level3 in range(1000):
+ level3_dir = os.path.join(level2_dir, f"{level3:03d}")
+ if not os.path.exists(level3_dir):
+ continue
+
+ # List all .npy files in level3 directory
+ for filename in os.listdir(level3_dir):
+ if filename.endswith('.npy'):
+ try:
+ index = int(filename.replace('.npy', ''))
+ available_indices.append(index)
+ except ValueError:
+ continue
+
+ return sorted(available_indices)
+
+ def __len__(self):
+ return len(self.indices)
+
+ def __getitem__(self, idx):
+ sample_idx = self.indices[idx]
+
+ # Get hierarchical paths for video codes
+ video_code_path = get_hierarchical_path(self.video_codes_dir, sample_idx)
+
+ # Load video codes
+ if not os.path.exists(video_code_path):
+ raise FileNotFoundError(f"Video code not found: {video_code_path}")
+ video_codes_np = np.load(video_code_path) # [F', H', W']
+ # Use torch.tensor() instead of torch.from_numpy() to ensure a complete copy
+ video_codes = torch.tensor(video_codes_np, dtype=torch.int32) # CPU tensor, int32, fully copied
+ del video_codes_np # Release numpy array reference
+
+ # Get text data from metadata
+ if sample_idx not in self.samples_data:
+ raise ValueError(f"Sample {sample_idx} not found in metadata")
+
+ sample_meta = self.samples_data[sample_idx]
+ caption = sample_meta.get('caption', '')
+ if not caption:
+ # Try alternative field names
+ caption = sample_meta.get('text', '') or sample_meta.get('prompt', '')
+
+ if not caption:
+ raise ValueError(f"No caption found for sample {sample_idx}")
+
+ # Apply prompt prefix if specified
+ if self.prompt_prefix is not None:
+ caption = self.prompt_prefix + caption
+
+ # Tokenize text
+ prompt_input_ids, prompt_attention_mask = tokenize_prompt(self.tokenizer, caption, self.text_encoder_architecture)
+ # Drop a leading batch dimension of size 1; the tensors are cloned below so their storage is resizable
+ if isinstance(prompt_input_ids, torch.Tensor) and prompt_input_ids.ndim == 2 and prompt_input_ids.shape[0] == 1:
+ prompt_input_ids = prompt_input_ids[0]
+ if isinstance(prompt_attention_mask, torch.Tensor) and prompt_attention_mask.ndim == 2 and prompt_attention_mask.shape[0] == 1:
+ prompt_attention_mask = prompt_attention_mask[0]
+
+ prompt_input_ids = prompt_input_ids.contiguous().clone()
+ prompt_attention_mask = prompt_attention_mask.contiguous().clone()
+
+ return {
+ "video_codes": video_codes, # [F', H', W'], CPU tensor, int32
+ "prompt_input_ids": prompt_input_ids, # Tokenized text
+ "prompt_attention_mask": prompt_attention_mask, # Attention mask
+ "sample_index": sample_idx,
+ }
+
+
+class PrecomputedFeatureDataset(Dataset):
+ """
+ Dataset for loading pre-extracted video codes and text embeddings.
+
+ This dataset loads features that were pre-extracted by extract_features.py,
+ avoiding the need to encode videos and text during training.
+
+ Features are stored in a 3-level hierarchical directory structure:
+ - video_codes/level1/level2/level3/index.npy
+ - text_embeddings/level1/level2/level3/index.npy
+ """
+
+ def __init__(
+ self,
+ features_dir,
+ num_samples=None,
+ start_index=0,
+ ):
+ """
+ Args:
+ features_dir: Directory containing extracted features (should have video_codes/ and text_embeddings/ subdirs)
+ num_samples: Number of samples to use. If None, use all available samples.
+ start_index: Starting index for samples (for resuming or subset selection)
+ """
+ self.features_dir = features_dir
+ self.video_codes_dir = os.path.join(features_dir, "video_codes")
+ self.text_embeddings_dir = os.path.join(features_dir, "text_embeddings")
+ self.metadata_file = os.path.join(features_dir, "metadata.json")
+
+ # Load metadata
+ if os.path.exists(self.metadata_file):
+ import json
+ with open(self.metadata_file, 'r') as f:
+ self.metadata = json.load(f)
+ logger.info(f"Loaded metadata from {self.metadata_file}")
+ logger.info(f" Total samples in metadata: {self.metadata.get('num_samples', 'unknown')}")
+
+ # Get available indices from metadata
+ if 'samples' in self.metadata and len(self.metadata['samples']) > 0:
+ available_indices = sorted([s['index'] for s in self.metadata['samples']])
+ else:
+ # Fallback: infer from directory structure
+ available_indices = self._scan_hierarchical_directory(self.video_codes_dir)
+ else:
+ # If no metadata, scan directory structure
+ logger.warning(f"Metadata file not found: {self.metadata_file}, scanning directory structure")
+ self.metadata = {}
+ available_indices = self._scan_hierarchical_directory(self.video_codes_dir)
+
+ # Filter by start_index and num_samples
+ available_indices = [idx for idx in available_indices if idx >= start_index]
+ if num_samples is not None:
+ available_indices = available_indices[:num_samples]
+
+ self.indices = available_indices
+ logger.info(f"PrecomputedFeatureDataset: {len(self.indices)} samples available")
+ if len(self.indices) > 0:
+ logger.info(f" Index range: {min(self.indices)} to {max(self.indices)}")
+
+ def _scan_hierarchical_directory(self, base_dir):
+ """
+ Scan hierarchical directory structure to find all available indices.
+
+ Args:
+ base_dir: Base directory to scan
+
+ Returns:
+ List of available indices
+ """
+ available_indices = []
+
+ if not os.path.exists(base_dir):
+ raise FileNotFoundError(f"Directory not found: {base_dir}")
+
+ # Scan level1 directories (000-999)
+ for level1 in range(1000):
+ level1_dir = os.path.join(base_dir, f"{level1:03d}")
+ if not os.path.exists(level1_dir):
+ continue
+
+ # Scan level2 directories (000-999)
+ for level2 in range(1000):
+ level2_dir = os.path.join(level1_dir, f"{level2:03d}")
+ if not os.path.exists(level2_dir):
+ continue
+
+ # Scan level3 directories (000-999)
+ for level3 in range(1000):
+ level3_dir = os.path.join(level2_dir, f"{level3:03d}")
+ if not os.path.exists(level3_dir):
+ continue
+
+ # List all .npy files in level3 directory
+ for filename in os.listdir(level3_dir):
+ if filename.endswith('.npy'):
+ try:
+ index = int(filename.replace('.npy', ''))
+ available_indices.append(index)
+ except ValueError:
+ continue
+
+ return sorted(available_indices)
+
+ def __len__(self):
+ return len(self.indices)
+
+ def __getitem__(self, idx):
+ sample_idx = self.indices[idx]
+
+ # Get hierarchical paths
+ video_code_path = get_hierarchical_path(self.video_codes_dir, sample_idx)
+ text_embedding_path = get_hierarchical_path(self.text_embeddings_dir, sample_idx)
+
+ # Load video codes
+ # Note: We load directly (not mmap) to avoid storage sharing issues with torch
+ # The files are small enough (video codes are int32, typically < 1MB per sample)
+ if not os.path.exists(video_code_path):
+ raise FileNotFoundError(f"Video code not found: {video_code_path}")
+ video_codes_np = np.load(video_code_path) # [F', H', W']
+ # Use torch.tensor() instead of torch.from_numpy() to ensure a complete copy
+ # This avoids "Trying to resize storage that is not resizable" errors in DataLoader collate
+ video_codes = torch.tensor(video_codes_np, dtype=torch.int32) # CPU tensor, int32, fully copied
+ del video_codes_np # Release numpy array reference
+
+ # Load text embedding
+ # Note: We load directly (not mmap) to avoid storage sharing issues with torch
+ if not os.path.exists(text_embedding_path):
+ raise FileNotFoundError(f"Text embedding not found: {text_embedding_path}")
+ text_embedding_np = np.load(text_embedding_path) # [L, D]
+ # Use torch.tensor() instead of torch.from_numpy() to ensure a complete copy
+ # Preserve original dtype (should be float16 from extraction)
+ text_embedding_dtype = torch.float16 if text_embedding_np.dtype == np.float16 else torch.float32
+ text_embedding = torch.tensor(text_embedding_np, dtype=text_embedding_dtype) # CPU tensor, fully copied
+ del text_embedding_np # Release numpy array reference
+
+ # Get context length from metadata if available
+ context_len = None
+ if 'samples' in self.metadata:
+ # Find the sample metadata
+ sample_meta = next((s for s in self.metadata['samples'] if s['index'] == sample_idx), None)
+ if sample_meta and 'context_len' in sample_meta:
+ context_len = sample_meta['context_len']
+
+ return {
+ "video_codes": video_codes, # [F', H', W'], CPU tensor, int32
+ "text_embedding": text_embedding, # [L, D], CPU tensor, float16 or float32
+ "sample_index": sample_idx,
+ "context_len": context_len, # Effective text length (int or None)
+ }
\ No newline at end of file
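Reviewer note: the three-level layout that both dataset docstrings describe can be sketched as below. This is a minimal illustration, assuming the `level1/level2/level3/{index:08d}.npy` convention used by `get_hierarchical_path` in `train/extract_features.py`. The names `hierarchical_file_path` and `scan_indices` are hypothetical, and `os.walk` is swapped in for the nested `range(1000)` loops of `_scan_hierarchical_directory`, so this is not the code in the diff.

```python
import os
import tempfile


def hierarchical_file_path(base_dir: str, index: int) -> str:
    """Hypothetical helper: map a global sample index to its .npy path."""
    level1 = index // 1_000_000
    level2 = (index // 1000) % 1000
    level3 = index % 1000
    return os.path.join(
        base_dir, f"{level1:03d}", f"{level2:03d}", f"{level3:03d}", f"{index:08d}.npy"
    )


def scan_indices(base_dir: str) -> list:
    """Collect every integer-named .npy file under base_dir, sorted ascending."""
    found = []
    for _root, _dirs, files in os.walk(base_dir):
        for name in files:
            stem, ext = os.path.splitext(name)
            if ext != ".npy":
                continue
            try:
                found.append(int(stem))
            except ValueError:
                continue  # not an index-named file, ignore it
    return sorted(found)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for idx in (5, 1234, 2_000_017):
            path = hierarchical_file_path(tmp, idx)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "wb").close()  # empty placeholder file, not a real array
        print(scan_indices(tmp))
```

Walking only the directories that exist avoids probing up to 1000 candidate names per level, at the cost of also visiting any unrelated subdirectories under `base_dir`.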
diff --git a/Meissonic/train/extrach_check_missing.sh b/Meissonic/train/extrach_check_missing.sh
new file mode 100644
index 0000000000000000000000000000000000000000..8dbfdc6ea5d2a2513754447253d1f635f510faad
--- /dev/null
+++ b/Meissonic/train/extrach_check_missing.sh
@@ -0,0 +1,4 @@
+python train/extract_check_missing.py \
+ --csv /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv \
+ --root /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set \
+ --world_size 8 --processes 8
diff --git a/Meissonic/train/extract.sh b/Meissonic/train/extract.sh
new file mode 100644
index 0000000000000000000000000000000000000000..0c43948d5634466d02a3a96ce74b1a019f2c2f1b
--- /dev/null
+++ b/Meissonic/train/extract.sh
@@ -0,0 +1,67 @@
+# mv /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128_full_set /opt/dlami/nvme
+
+
+# accelerate launch --multi_gpu --gpu_ids '0,1,2,3,4,5,6,7' --main_process_port 25011 --num_processes 8 \
+# train/extract_features.py \
+# --csv_path /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv \
+# --output_dir /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set \
+# --text_encoder_architecture umt5-xxl \
+# --video_tokenizer_model_id Cosmos-0.1-Tokenizer-DV4x8x8 \
+# --num_frames 17 \
+# --video_height 256 \
+# --video_width 256 \
+# --batch_size 64 \
+# --num_workers 8 \
+# --extract_text \
+# --extract_video \
+# --save_attention_mask \
+# --skip_existing
+
+accelerate launch --multi_gpu --gpu_ids '0,1,2,3,4,5,6,7' --main_process_port 25012 --num_processes 8 \
+ train/extract_features.py \
+ --csv_path /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv \
+ --output_dir /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set \
+ --text_encoder_architecture umt5-xxl \
+ --video_tokenizer_model_id Cosmos-0.1-Tokenizer-DV4x8x8 \
+ --num_frames 17 \
+ --video_height 256 \
+ --video_width 256 \
+ --batch_size 64 \
+ --num_workers 8 \
+ --extract_text --extract_video --save_attention_mask \
+ --skip_existing \
+ --index_file /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set/missing_process_{rank}.txt \
+ --run_tag resume \
+ --merge_world_size 8
+
+
+
+# python train/extract_empty_embeds.py \
+# --text_encoder_architecture umt5-base \
+# --output_path /path/to/empty_embeds.pt \
+# --dtype float16
+
+
+# python train/train_mei_video.py \
+# --use_precomputed_features \
+# --features_dir /path/to/extracted_features \
+# --text_encoder_architecture umt5-base \
+# --video_tokenizer_model_id Cosmos-1.0-Tokenizer-DV8x16x16 \
+# --num_frames 16 \
+# --video_height 480 \
+# --video_width 848 \
+# --train_batch_size 8 \
+# --learning_rate 3e-4 \
+# --max_train_steps 10000 \
+# --output_dir ./output \
+# --mixed_precision bf16
+
+
+# python train/check_codebook_range.py \
+# --csv_path /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv \
+# --video_tokenizer_model_id Cosmos-0.1-Tokenizer-DV4x8x8 \
+# --num_frames 16 \
+# --video_height 480 \
+# --video_width 848 \
+# --check_interval 10 \
+# --max_samples 1000 # Optional: limit the number of samples checked
\ No newline at end of file
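Reviewer note: the resume command in `extract.sh` passes `--index_file .../missing_process_{rank}.txt`. The sketch below shows how such per-rank lists relate to round-robin sharding (rank `r` owns indices `r, r + world_size, ...`, as `extract_check_missing.py` assumes) and how a `{rank}` placeholder might be resolved. The helper names are hypothetical, and the `str.replace` substitution is an assumption: the part of `extract_features.py` that handles `--index_file` is not shown in this section.

```python
import os
import tempfile


def expected_indices(rank: int, num_rows: int, world_size: int) -> list:
    """Indices a rank owns under round-robin sharding: r, r + world_size, ..."""
    return list(range(rank, num_rows, world_size))


def resolve_index_file(template: str, rank: int) -> str:
    """Assumed behaviour: substitute the literal '{rank}' placeholder."""
    return template.replace("{rank}", str(rank))


def read_index_file(path: str) -> list:
    """Read one integer index per line, ignoring blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [int(line) for line in f if line.strip()]


if __name__ == "__main__":
    num_rows, world_size, rank = 10, 4, 1
    seen = {1, 9}  # made-up data: indices this rank already has metadata for
    missing = [i for i in expected_indices(rank, num_rows, world_size) if i not in seen]
    with tempfile.TemporaryDirectory() as tmp:
        template = os.path.join(tmp, "missing_process_{rank}.txt")
        path = resolve_index_file(template, rank)
        with open(path, "w", encoding="utf-8") as f:
            f.write("\n".join(map(str, missing)) + "\n")
        print(read_index_file(path))  # indices a resumed run would re-extract
```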
diff --git a/Meissonic/train/extract_check_missing.py b/Meissonic/train/extract_check_missing.py
new file mode 100644
index 0000000000000000000000000000000000000000..a94a35cf8443280e50d3cc63013c12544534ed6b
--- /dev/null
+++ b/Meissonic/train/extract_check_missing.py
@@ -0,0 +1,206 @@
+
+import argparse
+import json
+from pathlib import Path
+
+def count_csv_rows(csv_path: Path) -> int:
+ # Count rows only (minus the header) without loading the file into memory
+ n = 0
+ with csv_path.open("rb") as f:
+ for _ in f:
+ n += 1
+ return max(0, n - 1)
+
+def iter_samples_from_pos(f, initial_buf: str, stop_on_trunc=True):
+ """
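+ Starting right after '"samples": [', stream out each { ... } object using brace matching.
+ Even if the file is truncated at the end, every fully parsed object is still yielded; a trailing incomplete object is discarded.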
+ """
+ in_string = False
+ escape = False
+ depth = 0
+ collecting = False
+ obj_chars = []
+
+ def feed(ch: str):
+ nonlocal in_string, escape, depth, collecting, obj_chars
+ if in_string:
+ if collecting:
+ obj_chars.append(ch)
+ if escape:
+ escape = False
+ else:
+ if ch == "\\":
+ escape = True
+ elif ch == '"':
+ in_string = False
+ return None
+
+ # not in string
+ if ch == '"':
+ in_string = True
+ if collecting:
+ obj_chars.append(ch)
+ return None
+
+ # samples array end
+ if not collecting and ch == "]":
+ return "__END__"
+
+ if ch == "{":
+ if not collecting:
+ collecting = True
+ depth = 1
+ obj_chars = ["{"]
+ else:
+ depth += 1
+ obj_chars.append("{")
+ return None
+
+ if collecting:
+ obj_chars.append(ch)
+ if ch == "{":
+ depth += 1
+ elif ch == "}":
+ depth -= 1
+ if depth == 0:
+ s = "".join(obj_chars)
+ collecting = False
+ obj_chars = []
+ try:
+ return json.loads(s)
+ except json.JSONDecodeError:
+ # This normally only happens when the object itself is truncated or corrupted
+ return "__BAD_OBJECT__"
+ return None
+
+ def consume(text: str):
+ for ch in text:
+ out = feed(ch)
+ if out == "__END__":
+ return "__END__"
+ if isinstance(out, dict):
+ yield out
+ # bad object: skip
+ return None
+
+ endflag = yield from consume(initial_buf)
+ if endflag == "__END__":
+ return
+
+ while True:
+ chunk = f.read(1024 * 1024)
+ if not chunk:
+ break
+ endflag = yield from consume(chunk)
+ if endflag == "__END__":
+ return
+
+ # End of file: if still collecting, the file was truncated; discard the last incomplete object
+ return
+
+def iter_samples_salvage(meta_path: Path):
+ """
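+ Fault-tolerant read of each sample in the samples array of metadata_process_i.json.
+ If the file is truncated, still read out as much of the complete leading part as possible.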
+ """
+ with meta_path.open("r", encoding="utf-8-sig") as f:
+ buf = ""
+ found = False
+
+ # Find the '[' that follows '"samples"'
+ while True:
+ chunk = f.read(1024 * 1024)
+ if not chunk:
+ break
+ buf += chunk
+ k = buf.find('"samples"')
+ if k != -1:
+ b = buf.find("[", k)
+ if b != -1:
+ buf = buf[b + 1 :] # start right after '['
+ found = True
+ break
+ # Keep buf from growing without bound
+ if len(buf) > 8 * 1024 * 1024:
+ buf = buf[-4 * 1024 * 1024 :]
+
+ if not found:
+ return # "samples" key not found (file too damaged or in a different format)
+
+ yield from iter_samples_from_pos(f, buf)
+
+def main():
+ ap = argparse.ArgumentParser()
+ ap.add_argument("--csv", required=True, help="Path to OpenVid1M_reorganized.csv")
+ ap.add_argument("--root", required=True, help="extracted_features_* root directory (contains metadata_process_*.json)")
+ ap.add_argument("--world_size", type=int, default=8)
+ ap.add_argument("--processes", type=int, default=8)
+ args = ap.parse_args()
+
+ csv_path = Path(args.csv)
+ root = Path(args.root)
+ assert csv_path.exists(), f"CSV not found: {csv_path}"
+ assert root.exists(), f"root not found: {root}"
+
+ N = count_csv_rows(csv_path)
+ print(f"CSV rows (N, 0-based indices 0..N-1): {N}")
+ print(f"world_size={args.world_size}, processes={args.processes}")
+ print("-" * 80)
+
+ all_missing = []
+
+ for r in range(args.processes):
+ meta = root / f"metadata_process_{r}.json"
+ if not meta.exists():
+ print(f"[rank {r}] metadata file missing: {meta}")
+ continue
+
+ seen = set()
+ total_samples_parsed = 0
+ bad_or_missing_index = 0
+
+ for s in iter_samples_salvage(meta):
+ total_samples_parsed += 1
+ idx = s.get("index", None)
+ if idx is None:
+ bad_or_missing_index += 1
+ continue
+ try:
+ idx = int(idx)
+ except (TypeError, ValueError):
+ bad_or_missing_index += 1
+ continue
+ seen.add(idx)
+
+ # Expected indices for this rank: r, r + world_size, ...
+ exp_count = 0
+ missing = []
+ if r < N:
+ for idx in range(r, N, args.world_size):
+ exp_count += 1
+ if idx not in seen:
+ missing.append(idx)
+
+ out_txt = root / f"missing_process_{r}.txt"
+ out_txt.write_text("\n".join(map(str, missing)) + ("\n" if missing else ""), encoding="utf-8")
+
+ all_missing.extend(missing)
+
+ # Diagnostics: if the number of parsed samples is far below exp_count, the metadata was most likely truncated or never fully written
+ coverage = (len(seen) / exp_count * 100.0) if exp_count > 0 else 0.0
+ print(
+ f"[rank {r}] expected={exp_count:,} parsed_samples={total_samples_parsed:,} "
+ f"unique_index={len(seen):,} idx_bad={bad_or_missing_index:,} "
+ f"missing={len(missing):,} coverage={coverage:.2f}% -> {out_txt.name}"
+ )
+
+ all_missing = sorted(set(all_missing))
+ out_all = root / "missing_all.txt"
+ out_all.write_text("\n".join(map(str, all_missing)) + ("\n" if all_missing else ""), encoding="utf-8")
+
+ print("-" * 80)
+ print(f"TOTAL missing unique indices vs CSV = {len(all_missing):,} -> {out_all}")
+
+if __name__ == "__main__":
+ main()
\ No newline at end of file
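Reviewer note: the salvage idea in `extract_check_missing.py` (keep every complete object in `"samples": [...]` and drop a truncated tail) can be illustrated more compactly. The sketch below swaps the hand-written brace matcher for `json.JSONDecoder.raw_decode` from the standard library. It is a simplified alternative under two stated differences: it works on a string already in memory rather than streaming 1 MB chunks, and it stops at the first undecodable element instead of skipping it. `salvage_samples` is a hypothetical name.

```python
import json


def salvage_samples(text: str):
    """Yield each complete object from a possibly truncated '"samples": [...]' array."""
    key = text.find('"samples"')
    if key == -1:
        return
    pos = text.find("[", key)
    if pos == -1:
        return
    pos += 1
    decoder = json.JSONDecoder()
    while True:
        # Skip whitespace and the commas between array elements.
        while pos < len(text) and text[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(text) or text[pos] == "]":
            return
        try:
            obj, pos = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            return  # truncated or corrupt tail: stop, keeping what was parsed
        if isinstance(obj, dict):
            yield obj


if __name__ == "__main__":
    # Made-up metadata that was cut off in the middle of the third sample.
    truncated = (
        '{"num_samples": 3, "samples": ['
        '{"index": 0, "caption": "a {cat}"}, {"index": 8}, {"index": 16, "cap'
    )
    print([s["index"] for s in salvage_samples(truncated)])
```

Braces inside string values, such as the caption above, are handled by the JSON decoder itself, which is the case the `in_string` and `escape` flags cover in the brace-matching version.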
diff --git a/Meissonic/train/extract_empty_embeds.py b/Meissonic/train/extract_empty_embeds.py
new file mode 100644
index 0000000000000000000000000000000000000000..b28c7daec8f506d57ec0af32c64cf3d75dce137b
--- /dev/null
+++ b/Meissonic/train/extract_empty_embeds.py
@@ -0,0 +1,134 @@
+#!/usr/bin/env python3
+"""
+Extract and save empty_embeds for conditional dropout.
+
+This script extracts the empty embedding (from empty string prompt)
+and saves it to a file that can be loaded during training with precomputed features.
+"""
+
+import argparse
+import os
+import json
+import torch
+from transformers import T5EncoderModel, T5Tokenizer
+from dataset_utils import tokenize_prompt, encode_prompt
+import logging
+
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger(__name__)
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(description="Extract empty embeddings for conditional dropout")
+
+ parser.add_argument(
+ "--text_encoder_architecture",
+ type=str,
+ default="umt5-base",
+ choices=["umt5-base", "umt5-xxl", "t5"],
+ help="Text encoder architecture",
+ )
+ parser.add_argument(
+ "--output_path",
+ type=str,
+ required=True,
+ help="Path to save the empty_embeds (will save as .pt file and metadata as .json)",
+ )
+ parser.add_argument(
+ "--device",
+ type=str,
+ default="cuda" if torch.cuda.is_available() else "cpu",
+ help="Device to use for encoding",
+ )
+ parser.add_argument(
+ "--dtype",
+ type=str,
+ default="float16",
+ choices=["float16", "bfloat16", "float32"],
+ help="Data type for saving embeddings",
+ )
+
+ return parser.parse_args()
+
+
+def main():
+ args = parse_args()
+
+ # Map architecture to model ID
+ if args.text_encoder_architecture == "umt5-base":
+ model_id = "google/umt5-base"
+ elif args.text_encoder_architecture == "umt5-xxl":
+ model_id = "google/umt5-xxl"
+ elif args.text_encoder_architecture == "t5":
+ model_id = "t5-base"
+ else:
+ raise ValueError(f"Unknown text encoder architecture: {args.text_encoder_architecture}")
+
+ # Map dtype
+ dtype_map = {
+ "float16": torch.float16,
+ "bfloat16": torch.bfloat16,
+ "float32": torch.float32,
+ }
+ dtype = dtype_map[args.dtype]
+
+ logger.info(f"Loading text encoder: {model_id}")
+ logger.info(f"Device: {args.device}, Dtype: {args.dtype}")
+
+ # Load text encoder and tokenizer
+ text_encoder = T5EncoderModel.from_pretrained(model_id)
+ tokenizer = T5Tokenizer.from_pretrained(model_id)
+
+ # Move to device and set dtype
+ text_encoder.to(device=args.device, dtype=dtype)
+ text_encoder.eval()
+ text_encoder.requires_grad_(False)
+
+ # Extract empty embedding
+ logger.info("Extracting empty embedding from empty string...")
+ with torch.no_grad():
+ empty_input_ids, _ = tokenize_prompt(tokenizer, "", args.text_encoder_architecture) # returns (input_ids, attention_mask), as unpacked in dataset_utils
+ empty_input_ids = empty_input_ids.to(args.device)
+
+ empty_embeds, cond_embeds = encode_prompt(
+ text_encoder,
+ empty_input_ids,
+ args.text_encoder_architecture
+ )
+
+ # Convert to CPU and target dtype
+ empty_embeds = empty_embeds.cpu().to(dtype)
+
+ logger.info(f"Empty embedding shape: {empty_embeds.shape}")
+ logger.info(f"Empty embedding dtype: {empty_embeds.dtype}")
+
+ # Save empty_embeds
+ output_dir = os.path.dirname(args.output_path)
+ if output_dir:
+ os.makedirs(output_dir, exist_ok=True)
+
+ # Save as .pt file
+ torch.save(empty_embeds, args.output_path)
+ logger.info(f"Saved empty_embeds to: {args.output_path}")
+
+ # Save metadata
+ metadata_path = os.path.splitext(args.output_path)[0] + '.json' # swap only the extension; str.replace would rewrite every '.pt' in the path
+ metadata = {
+ "text_encoder_architecture": args.text_encoder_architecture,
+ "model_id": model_id,
+ "empty_embeds_shape": list(empty_embeds.shape),
+ "empty_embeds_dtype": str(empty_embeds.dtype),
+ "device": args.device,
+ "dtype": args.dtype,
+ }
+
+ with open(metadata_path, 'w') as f:
+ json.dump(metadata, f, indent=2)
+ logger.info(f"Saved metadata to: {metadata_path}")
+
+ logger.info("Done!")
+
+
+if __name__ == "__main__":
+ main()
+
diff --git a/Meissonic/train/extract_features.py b/Meissonic/train/extract_features.py
new file mode 100644
index 0000000000000000000000000000000000000000..a2e64d81bc200e9d6d4f7633f870f4cedc0f6a21
--- /dev/null
+++ b/Meissonic/train/extract_features.py
@@ -0,0 +1,848 @@
+#!/usr/bin/env python3
+"""
+Extract video codes and text embeddings from video-text pairs for efficient training.
+
+Adds resume-by-index_file and robust merge:
+- --index_file: run only specified global indices (e.g. missing_process_0.txt)
+- --run_tag: write new per-rank metadata as metadata_process_{rank}.{tag}.json
+- Merge old possibly-truncated metadata_process_*.json + new tagged ones into metadata.json
+"""
+
+import argparse
+import os
+import sys
+import logging
+from tqdm import tqdm
+import torch
+import numpy as np
+from torch.utils.data import DataLoader, DistributedSampler, Sampler
+import json
+
+sys.path.append(os.getcwd())
+
+from train.dataset_utils import OpenVid1MDataset
+from src.pipeline_video import CosmosVideoTokenizer
+from transformers import T5Tokenizer, T5EncoderModel
+from accelerate import Accelerator
+
+logging.basicConfig(
+ format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
+ datefmt="%m/%d/%Y %H:%M:%S",
+ level=logging.INFO,
+)
+logger = logging.getLogger(__name__)
+
+
+# -----------------------------
+# IO helpers
+# -----------------------------
+def get_hierarchical_path(base_dir, index):
+ """
+ Structure: base_dir/level1/level2/level3/filename.npy
+ - level1: index // 1000000 (0-999)
+ - level2: (index // 1000) % 1000 (0-999)
+ - level3: index % 1000 (0-999)
+ """
+ level1 = index // 1000000
+ level2 = (index // 1000) % 1000
+ level3 = index % 1000
+ dir_path = os.path.join(base_dir, f"{level1:03d}", f"{level2:03d}", f"{level3:03d}")
+ file_path = os.path.join(dir_path, f"{index:08d}.npy")
+ return dir_path, file_path
+
+
+def atomic_save_npy(path: str, arr: np.ndarray):
+ os.makedirs(os.path.dirname(path) or ".", exist_ok=True) # a bare filename has an empty dirname
+ tmp = path + ".tmp"
+ with open(tmp, "wb") as f:
+ np.save(f, arr)
+ os.replace(tmp, path)
+
+
+def atomic_save_json(path: str, obj, indent=2):
+ os.makedirs(os.path.dirname(path) or ".", exist_ok=True) # a bare filename has an empty dirname
+ tmp = path + ".tmp"
+ with open(tmp, "w") as f:
+ json.dump(obj, f, indent=indent)
+ os.replace(tmp, path)
+
+
+def safe_mmap_shape(npy_path: str):
+ try:
+ arr = np.load(npy_path, mmap_mode="r")
+ return list(arr.shape)
+ except Exception:
+ return None
+
+
+def normalize_input_ids(x: torch.Tensor) -> torch.Tensor:
+ if x.ndim == 3:
+ if x.shape[1] == 1:
+ x = x.squeeze(1)
+ elif x.shape[2] == 1:
+ x = x.squeeze(2)
+ else:
+ raise ValueError(f"Unexpected input_ids shape: {tuple(x.shape)}")
+ elif x.ndim != 2:
+ raise ValueError(f"Unexpected input_ids ndim: {x.ndim}, shape={tuple(x.shape)}")
+ return x
+
+
+def get_feature_paths(args, video_codes_dir, text_embeddings_dir, attention_masks_dir, sample_idx: int):
+ paths = {}
+ if args.extract_video:
+ _, vp = get_hierarchical_path(video_codes_dir, sample_idx)
+ paths["video"] = vp
+ if args.extract_text:
+ _, tp = get_hierarchical_path(text_embeddings_dir, sample_idx)
+ paths["text"] = tp
+ if args.save_attention_mask:
+ _, ap = get_hierarchical_path(attention_masks_dir, sample_idx)
+ paths["mask"] = ap
+ return paths
+
+
+# -----------------------------
+# robust salvage for truncated JSON
+# -----------------------------
+def iter_samples_salvage(meta_path: str):
+ """
+ Read possibly-truncated metadata_process_*.json and salvage complete objects in "samples":[...].
+ """
+ p = meta_path
+ if not os.path.exists(p):
+ return
+ with open(p, "r", encoding="utf-8-sig") as f:
+ buf = ""
+ found = False
+ while True:
+ chunk = f.read(1024 * 1024)
+ if not chunk:
+ break
+ buf += chunk
+ k = buf.find('"samples"')
+ if k != -1:
+ b = buf.find("[", k)
+ if b != -1:
+ buf = buf[b + 1 :]
+ found = True
+ break
+ if len(buf) > 8 * 1024 * 1024:
+ buf = buf[-4 * 1024 * 1024 :]
+ if not found:
+ return
+
+ in_string = False
+ escape = False
+ depth = 0
+ collecting = False
+ obj = []
+
+ def feed(ch):
+ nonlocal in_string, escape, depth, collecting, obj
+ if in_string:
+ if collecting:
+ obj.append(ch)
+ if escape:
+ escape = False
+ else:
+ if ch == "\\":
+ escape = True
+ elif ch == '"':
+ in_string = False
+ return None
+
+ if ch == '"':
+ in_string = True
+ if collecting:
+ obj.append(ch)
+ return None
+
+ if not collecting and ch == "]":
+ return "__END__"
+
+ if ch == "{":
+ if not collecting:
+ collecting = True
+ depth = 1
+ obj = ["{"]
+ else:
+ depth += 1
+ obj.append("{")
+ return None
+
+ if collecting:
+ obj.append(ch)
+ if ch == "{":
+ depth += 1
+ elif ch == "}":
+ depth -= 1
+ if depth == 0:
+ s = "".join(obj)
+ collecting = False
+ obj = []
+ try:
+ return json.loads(s)
+ except Exception:
+ return "__BAD__"
+ return None
+
+ def consume(text):
+ for ch in text:
+ out = feed(ch)
+ if out == "__END__":
+ return "__END__"
+ if isinstance(out, dict):
+ yield out
+
+ end = yield from consume(buf)
+ if end == "__END__":
+ return
+
+ while True:
+ chunk = f.read(1024 * 1024)
+ if not chunk:
+ break
+ end = yield from consume(chunk)
+ if end == "__END__":
+ return
+
+
+def merge_metadata(output_dir: str, merge_world_size: int, run_tag: str):
+ """
+ Merge:
+ - old metadata_process_{r}.json (may be truncated -> salvage)
+ - new metadata_process_{r}.{run_tag}.json (assumed complete)
+ into metadata.json (index-dedup, prefer new).
+ """
+ outdir = output_dir
+ by_idx = {}
+
+ # old first
+ for r in range(merge_world_size):
+ old_p = os.path.join(outdir, f"metadata_process_{r}.json")
+ for s in iter_samples_salvage(old_p) or []:
+ idx = s.get("index", None)
+ if idx is None:
+ continue
+ by_idx[int(idx)] = s
+
+ # then new (override)
+ if run_tag:
+ for r in range(merge_world_size):
+ new_p = os.path.join(outdir, f"metadata_process_{r}.{run_tag}.json")
+ if not os.path.exists(new_p):
+ continue
+ try:
+ with open(new_p, "r") as f:
+ meta = json.load(f)
+ for s in meta.get("samples", []):
+ idx = s.get("index", None)
+ if idx is None:
+ continue
+ by_idx[int(idx)] = s
+ except Exception as e:
+ logger.warning(f"Failed to load new meta {new_p}: {e}")
+
+ samples = [by_idx[k] for k in sorted(by_idx.keys())]
+
+ merged = {
+ "num_extracted_metadata": len(samples),
+ "world_size_used": merge_world_size,
+ "samples": samples,
+ }
+
+ # try pull header info from any available new meta (rank0 preferred)
+ header_src = None
+ if run_tag:
+ p0 = os.path.join(outdir, f"metadata_process_0.{run_tag}.json")
+ if os.path.exists(p0):
+ header_src = p0
+ if header_src is None:
+ p0_old = os.path.join(outdir, "metadata_process_0.json")
+ if os.path.exists(p0_old):
+ header_src = p0_old
+
+ if header_src:
+ # best-effort load; if truncated, skip
+ try:
+ with open(header_src, "r") as f:
+ m0 = json.load(f)
+ for k in [
+ "extract_video","extract_text","text_encoder_architecture","video_tokenizer_model_id",
+ "codebook_size","mask_token_id","num_frames","video_height","video_width",
+ "prompt_prefix","text_dtype","save_attention_mask","empty_embeds_shape","empty_embeds_path",
+ "num_samples_original","resume_from_index","num_samples_this_run","num_attempted",
+ "num_extracted","num_failed","num_processes","ranks_seen"
+ ]:
+ if k in m0 and m0[k] is not None:
+ merged[k] = m0[k]
+ except Exception:
+ pass
+
+ metadata_file = os.path.join(outdir, "metadata.json")
+ # backup old
+ if os.path.exists(metadata_file):
+ bak = os.path.join(outdir, "metadata.json.bak")
+ try:
+ os.replace(metadata_file, bak)
+ logger.info(f"Backed up old metadata.json -> {bak}")
+ except Exception:
+ pass
+
+ atomic_save_json(metadata_file, merged, indent=2)
+ logger.info(f"[MERGE] Wrote {metadata_file}, samples={len(samples):,}")
+
+
+# -----------------------------
+# Sampler for index list
+# -----------------------------
+class IndexListSampler(Sampler):
+ def __init__(self, indices):
+ self.indices = list(indices)
+ def __iter__(self):
+ return iter(self.indices)
+ def __len__(self):
+ return len(self.indices)
+
+
+# -----------------------------
+# Args
+# -----------------------------
+def parse_args():
+ parser = argparse.ArgumentParser(description="Extract video codes and text embeddings")
+
+ parser.add_argument("--csv_path", type=str, required=True, help="Path to OpenVid1M CSV file")
+ parser.add_argument("--video_root_dir", type=str, default=None, help="Root directory containing video files. If None, auto-detect.")
+ parser.add_argument("--output_dir", type=str, required=True, help="Output directory to save extracted features")
+
+ parser.add_argument(
+ "--text_encoder_architecture",
+ type=str,
+ default="umt5-base",
+ choices=["umt5-base", "umt5-xxl", "t5"],
+ help="Text encoder architecture",
+ )
+ parser.add_argument(
+ "--video_tokenizer_model_id",
+ type=str,
+ default="Cosmos-1.0-Tokenizer-DV8x16x16",
+ help="HuggingFace model ID for Cosmos video tokenizer",
+ )
+ parser.add_argument("--num_frames", type=int, default=16, help="Number of frames per video")
+ parser.add_argument("--video_height", type=int, default=480, help="Video height")
+ parser.add_argument("--video_width", type=int, default=848, help="Video width")
+ parser.add_argument("--batch_size", type=int, default=4, help="Batch size for feature extraction")
+ parser.add_argument("--num_workers", type=int, default=4, help="Number of dataloader workers")
+ parser.add_argument("--max_samples", type=int, default=None, help="Max samples (for testing). If None, process all.")
+ parser.add_argument("--resume_from_index", type=int, default=0, help="Resume extraction from this index")
+ parser.add_argument("--prompt_prefix", type=str, default=None, help="Prefix to add to prompts")
+
+ parser.add_argument("--extract_video", action="store_true", default=False, help="Extract video codes")
+ parser.add_argument("--extract_text", action="store_true", default=False, help="Extract text embeddings")
+
+ parser.add_argument("--text_dtype", type=str, default="bf16", choices=["fp32", "fp16", "bf16"], help="Text encoder dtype")
+
+ parser.add_argument("--skip_existing", action="store_true", help="Skip samples whose feature .npy already exist.")
+ parser.add_argument("--overwrite", action="store_true", help="Overwrite existing .npy (disables skip_existing).")
+
+ group = parser.add_mutually_exclusive_group()
+ group.add_argument("--save_attention_mask", dest="save_attention_mask", action="store_true",
+ help="Save attention mask per sample (default: on).")
+ group.add_argument("--no_save_attention_mask", dest="save_attention_mask", action="store_false",
+ help="Do NOT save attention mask per sample.")
+ parser.set_defaults(save_attention_mask=True)
+
+ # resume / merge additions
+ parser.add_argument(
+ "--index_file",
+ type=str,
+ default=None,
+ help="Text file with one global sample index per line. "
+ "Can contain '{rank}' placeholder, e.g. missing_process_{rank}.txt"
+ )
+ parser.add_argument(
+ "--run_tag",
+ type=str,
+ default=None,
+ help="Tag for new per-rank metadata file: metadata_process_{rank}.{run_tag}.json"
+ )
+ parser.add_argument(
+ "--merge_world_size",
+ type=int,
+ default=None,
+ help="How many ranks to merge for final metadata.json. "
+ "Set this to the world size of the original run (e.g. 8 for the OpenVid1M run), even if you resume with 1 GPU."
+ )
+
+ return parser.parse_args()
+
+
+# -----------------------------
+# Main
+# -----------------------------
+def main():
+ args = parse_args()
+ accelerator = Accelerator()
+
+ rank = accelerator.process_index
+ world_size = accelerator.num_processes
+ logger.info(f"Process {rank}/{world_size} on device {accelerator.device}")
+
+ if accelerator.is_main_process:
+ os.makedirs(args.output_dir, exist_ok=True)
+ logger.info(f"Output directory: {args.output_dir}")
+ logger.info(f"Using {accelerator.num_processes} processes for parallel extraction")
+ logger.info(f"Extract video codes: {args.extract_video}")
+ logger.info(f"Extract text embeddings: {args.extract_text}")
+ logger.info(f"skip_existing={args.skip_existing}, overwrite={args.overwrite}, save_attention_mask={args.save_attention_mask}")
+ logger.info(f"index_file={args.index_file}, run_tag={args.run_tag}, merge_world_size={args.merge_world_size}")
+
+ if not args.extract_video and not args.extract_text:
+ raise ValueError("At least one feature type must be enabled. Use --extract_video and/or --extract_text.")
+
+ device = accelerator.device
+ dtype = torch.float32
+
+ # ---- text encoder/tokenizer
+ text_encoder = None
+ tokenizer = None
+ if args.extract_text:
+ logger.info(f"Loading text encoder: {args.text_encoder_architecture} with dtype {args.text_dtype}")
+ text_dtype_map = {"fp32": torch.float32, "fp16": torch.float16, "bf16": torch.bfloat16}
+ text_dtype = text_dtype_map[args.text_dtype]
+
+ if args.text_encoder_architecture == "umt5-base":
+ model_id = "google/umt5-base"
+ elif args.text_encoder_architecture == "umt5-xxl":
+ model_id = "google/umt5-xxl"
+ elif args.text_encoder_architecture == "t5":
+ model_id = "t5-base"
+ else:
+ raise ValueError(f"Unknown text encoder: {args.text_encoder_architecture}")
+
+ text_encoder = T5EncoderModel.from_pretrained(model_id, torch_dtype=text_dtype)
+ tokenizer = T5Tokenizer.from_pretrained(model_id)
+ text_encoder.to(device=device)
+ text_encoder.eval()
+ text_encoder.requires_grad_(False)
+
+ # empty_embeds on main process
+ if accelerator.is_main_process:
+ logger.info("Extracting empty_embeds for conditional dropout...")
+ with torch.no_grad():
+ empty = tokenizer("", return_tensors="pt", padding="max_length", max_length=512, truncation=True)
+ empty_ids = empty["input_ids"].to(device)
+ empty_mask = empty["attention_mask"].to(device)
+ outputs = text_encoder(input_ids=empty_ids, attention_mask=empty_mask)
+ empty_embeds = outputs.last_hidden_state # [1, 512, D]
+
+ empty_embeds_cpu = empty_embeds.cpu()
+ if empty_embeds_cpu.dtype == torch.bfloat16:
+ empty_embeds_cpu = empty_embeds_cpu.to(torch.float32)
+ empty_embeds_np = empty_embeds_cpu.numpy().astype(np.float16)
+ empty_embeds_path = os.path.join(args.output_dir, "empty_embeds.npy")
+ np.save(empty_embeds_path, empty_embeds_np)
+ logger.info(f"Saved empty_embeds to: {empty_embeds_path} shape={empty_embeds_np.shape} dtype={empty_embeds_np.dtype}")
+ else:
+ logger.info("Skipping text encoder loading (extract_text=False)")
+ if args.extract_video:
+ tokenizer = T5Tokenizer.from_pretrained("google/umt5-base")
+
+ # ---- video tokenizer
+ video_tokenizer = None
+ if args.extract_video:
+ logger.info(f"Loading video tokenizer: {args.video_tokenizer_model_id}")
+ video_tokenizer = CosmosVideoTokenizer(model_id=args.video_tokenizer_model_id, device=device, dtype=dtype)
+ video_tokenizer.requires_grad_(False)
+ video_tokenizer.eval()
+ else:
+ logger.info("Skipping video tokenizer loading (extract_video=False)")
+
+ # ---- auto-detect video_root_dir
+ if args.video_root_dir is None:
+ csv_dir = os.path.dirname(args.csv_path)
+ if os.path.exists(os.path.join(csv_dir, "video_reorg")):
+ video_root_dir = os.path.join(csv_dir, "video_reorg")
+ elif os.path.exists(os.path.join(os.path.dirname(csv_dir), "video_reorg")):
+ video_root_dir = os.path.join(os.path.dirname(csv_dir), "video_reorg")
+ else:
+ video_root_dir = csv_dir
+ logger.warning(f"Video directory not found, using CSV directory: {video_root_dir}")
+ else:
+ video_root_dir = args.video_root_dir
+
+ # ---- dataset
+ dataset = OpenVid1MDataset(
+ csv_path=args.csv_path,
+ video_root_dir=video_root_dir,
+ tokenizer=tokenizer,
+ num_frames=args.num_frames,
+ height=args.video_height,
+ width=args.video_width,
+ text_encoder_architecture=args.text_encoder_architecture,
+ prompt_prefix=args.prompt_prefix,
+ use_random_temporal_crop=False,
+ use_random_crop=False,
+ )
+
+ if args.max_samples is not None:
+ dataset.data = dataset.data[:args.max_samples]
+ logger.info(f"Limited dataset to {len(dataset)} samples")
+
+ if args.resume_from_index > 0:
+ dataset.data = dataset.data[args.resume_from_index:]
+ logger.info(f"Resuming from index {args.resume_from_index}, remaining samples: {len(dataset)}")
+
+ num_processes = accelerator.num_processes
+ process_index = accelerator.process_index
+
+ # ---- sampler: DistributedSampler OR IndexListSampler
+ if args.index_file is not None:
+ idx_path = args.index_file.format(rank=process_index)
+ with open(idx_path, "r") as f:
+ wanted_sample_idx = [int(x.strip()) for x in f if x.strip() and not x.strip().startswith("#")]
+
+ sampler_indices = []
+ for sample_idx in wanted_sample_idx:
+ global_dataset_idx = sample_idx - args.resume_from_index
+ if 0 <= global_dataset_idx < len(dataset.data):
+ sampler_indices.append(global_dataset_idx)
+
+ sampler = IndexListSampler(sampler_indices)
+ logger.info(f"[GPU {process_index}] Using index_file={idx_path}, indices={len(sampler_indices)}")
+ else:
+ sampler = DistributedSampler(
+ dataset,
+ num_replicas=num_processes,
+ rank=process_index,
+ shuffle=False,
+ drop_last=False,
+ )
+ sampler_indices = list(sampler)
+
+ dataloader = DataLoader(
+ dataset,
+ batch_size=args.batch_size,
+ sampler=sampler,
+ num_workers=args.num_workers,
+ pin_memory=True,
+ )
+
+ # ---- output dirs
+ video_codes_dir = None
+ text_embeddings_dir = None
+ attention_masks_dir = None
+ if args.extract_video:
+ video_codes_dir = os.path.join(args.output_dir, "video_codes")
+ os.makedirs(video_codes_dir, exist_ok=True)
+ if args.extract_text:
+ text_embeddings_dir = os.path.join(args.output_dir, "text_embeddings")
+ os.makedirs(text_embeddings_dir, exist_ok=True)
+ if args.save_attention_mask:
+ attention_masks_dir = os.path.join(args.output_dir, "attention_masks")
+ os.makedirs(attention_masks_dir, exist_ok=True)
+
+ # ---- load existing metadata shapes (optional)
+ metadata_file = os.path.join(args.output_dir, "metadata.json")
+ existing_shapes = {}
+ if os.path.exists(metadata_file):
+ try:
+ with open(metadata_file, "r") as f:
+ existing_meta = json.load(f)
+ for sample in existing_meta.get("samples", []):
+ idx = sample.get("index")
+ if idx is None:
+ continue
+ existing_shapes[int(idx)] = {
+ "video_code_shape": sample.get("video_code_shape"),
+ "text_embedding_shape": sample.get("text_embedding_shape"),
+ "context_len": sample.get("context_len"),
+ }
+ logger.info(f"[GPU {process_index}] Loaded existing metadata for {len(existing_shapes)} samples")
+ except Exception as e:
+ logger.warning(f"[GPU {process_index}] Failed to load existing metadata: {e}")
+
+ total_samples = len(dataset)
+ logger.info(f"[GPU {process_index}] Starting feature extraction for {total_samples} samples "
+ f"(process {process_index+1}/{num_processes}), assigned={len(sampler_indices)}")
+
+ # tokenizer info
+ codebook_size = None
+ mask_token_id = None
+ if args.extract_video and video_tokenizer is not None:
+ codebook_size = getattr(video_tokenizer, "codebook_size", None)
+ mask_token_id = getattr(video_tokenizer, "mask_token_id", None)
+ logger.info(f"[GPU {process_index}] Video tokenizer: codebook_size={codebook_size}, mask_token_id={mask_token_id}")
+
+ # empty embeds info
+ empty_embeds_shape = None
+ empty_embeds_path = os.path.join(args.output_dir, "empty_embeds.npy")
+ if args.extract_text and accelerator.is_main_process and os.path.exists(empty_embeds_path):
+ try:
+ empty_embeds_np = np.load(empty_embeds_path, mmap_mode="r")
+ empty_embeds_shape = list(empty_embeds_np.shape)
+ logger.info(f"Empty embeds shape: {empty_embeds_shape}")
+ except Exception as e:
+ logger.warning(f"Failed to read empty_embeds shape from {empty_embeds_path}: {e}")
+
+ process_metadata = {
+ "process_index": process_index,
+ "num_samples_this_run": total_samples,
+ "world_size_used": world_size,
+ "rank_used": rank,
+ "extract_video": args.extract_video,
+ "extract_text": args.extract_text,
+ "text_encoder_architecture": args.text_encoder_architecture if args.extract_text else None,
+ "video_tokenizer_model_id": args.video_tokenizer_model_id if args.extract_video else None,
+ "codebook_size": codebook_size,
+ "mask_token_id": mask_token_id,
+ "num_frames": args.num_frames,
+ "video_height": args.video_height,
+ "video_width": args.video_width,
+ "prompt_prefix": args.prompt_prefix,
+ "text_dtype": args.text_dtype if args.extract_text else None,
+ "save_attention_mask": args.save_attention_mask,
+ "empty_embeds_shape": empty_embeds_shape if process_index == 0 else None,
+ "empty_embeds_path": "empty_embeds.npy" if args.extract_text else None,
+ "samples": [],
+ }
+
+ process_failed_samples = []
+ process_samples_processed = 0
+ process_attempted_samples = 0 # counts actual encoded+written attempts (post-skip)
+
+ with torch.no_grad():
+ for batch_idx, batch in enumerate(
+ tqdm(dataloader, desc=f"[GPU {process_index}] Extracting", disable=not accelerator.is_main_process)
+ ):
+ batch_size = batch["video"].shape[0] if args.extract_video else batch["prompt_input_ids"].shape[0]
+ local_start_idx = batch_idx * args.batch_size
+
+ # ---- compute sample indices for this batch (global sample_idx)
+ batch_sample_indices = []
+ for i in range(batch_size):
+ local_idx = local_start_idx + i
+ if local_idx < len(sampler_indices):
+ global_dataset_idx = sampler_indices[local_idx]
+ sample_idx = args.resume_from_index + global_dataset_idx
+ batch_sample_indices.append(sample_idx)
+ else:
+ batch_sample_indices.append(None)
+
+ # ---- compute output paths
+ batch_paths = []
+ for sidx in batch_sample_indices:
+ if sidx is None:
+ batch_paths.append(None)
+ else:
+ batch_paths.append(get_feature_paths(args, video_codes_dir, text_embeddings_dir, attention_masks_dir, sidx))
+
+ # ---- determine per-feature need flags
+ need_video = [False] * batch_size
+ need_text = [False] * batch_size
+ need_mask = [False] * batch_size
+
+ for i, (sidx, paths) in enumerate(zip(batch_sample_indices, batch_paths)):
+ if sidx is None:
+ continue
+
+ if args.overwrite:
+ need_video[i] = args.extract_video
+ need_text[i] = args.extract_text
+ need_mask[i] = args.extract_text and args.save_attention_mask
+ elif args.skip_existing:
+ if args.extract_video:
+ need_video[i] = not os.path.exists(paths["video"])
+ if args.extract_text:
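+ need_text[i] = (not os.path.exists(paths["text"])) or (args.save_attention_mask and not os.path.exists(paths["mask"])) # masks are only produced while encoding text, so re-encode when only the mask is missing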
+ if args.save_attention_mask:
+ need_mask[i] = not os.path.exists(paths["mask"])
+ else:
+ need_video[i] = args.extract_video
+ need_text[i] = args.extract_text
+ need_mask[i] = args.extract_text and args.save_attention_mask
+
+ need_any = [v or t or m for v, t, m in zip(need_video, need_text, need_mask)]
+
+ # IMPORTANT FIX:
+ # - normal full run: if no one needs anything, skip this batch
+ # - index_file resume: even if no extraction needed, still record metadata
+ if (not any(need_any)) and (args.index_file is None):
+ continue
+
+ process_attempted_samples += sum(need_any)
+
+ if batch_idx == 0:
+ preview = [x for x in batch_sample_indices[:5] if x is not None]
+ logger.info(f"[GPU {process_index}] First batch sample indices: {preview}")
+ logger.info(f"[GPU {process_index}] Need any in first batch: {sum(need_any)}/{len(need_any)}")
+
+ # ---- encode video for needed samples
+ video_codes = None
+ need_video_idx = [i for i, ok in enumerate(need_video) if ok]
+ map_video_pos = {i: p for p, i in enumerate(need_video_idx)}
+ if args.extract_video and len(need_video_idx) > 0:
+ videos = batch["video"].to(device, non_blocking=True)
+ videos_sel = videos[need_video_idx]
+ try:
+ vc_sel = video_tokenizer.encode(videos_sel)
+ video_codes = vc_sel.detach().cpu().numpy()
+ except Exception as e:
+ logger.error(f"[GPU {process_index}] Failed to encode video batch {batch_idx}: {e}")
+ for i in need_video_idx:
+ sidx = batch_sample_indices[i]
+ if sidx is not None:
+ process_failed_samples.append({"index": sidx, "reason": "video_encoding_failed"})
+ continue
+
+ # ---- text context lens + encode needed
+ encoder_hidden_states = None
+ attention_masks = None
+ context_lens_np_full = None
+
+ if args.extract_text:
+ prompt_input_ids = batch["prompt_input_ids"]
+ if isinstance(prompt_input_ids, (tuple, list)): # unwrap before .to(), which only exists on tensors
+ prompt_input_ids = prompt_input_ids[0]
+ prompt_input_ids = normalize_input_ids(prompt_input_ids.to(device, non_blocking=True)).long()
+
+ pad_id = tokenizer.pad_token_id
+ attention_mask_full = (prompt_input_ids != pad_id).long()
+ context_lens_full = attention_mask_full.sum(dim=-1)
+ context_lens_np_full = context_lens_full.detach().cpu().numpy().astype(np.int32)
+
+ need_text_idx = [i for i, ok in enumerate(need_text) if ok]
+ map_text_pos = {i: p for p, i in enumerate(need_text_idx)}
+
+ if len(need_text_idx) > 0:
+ ids_sel = prompt_input_ids[need_text_idx]
+ mask_sel = attention_mask_full[need_text_idx]
+ try:
+ outputs = text_encoder(input_ids=ids_sel, attention_mask=mask_sel)
+ enc = outputs.last_hidden_state.detach().cpu()
+ if enc.dtype == torch.bfloat16:
+ enc = enc.to(torch.float32)
+ encoder_hidden_states = enc.numpy().astype(np.float16)
+ if args.save_attention_mask:
+ attention_masks = mask_sel.detach().cpu().numpy().astype(np.int32)
+ except Exception as e:
+ logger.error(f"[GPU {process_index}] Failed to encode text batch {batch_idx}: {e}")
+ for i in need_text_idx:
+ sidx = batch_sample_indices[i]
+ if sidx is not None:
+ process_failed_samples.append({"index": sidx, "reason": "text_encoding_failed"})
+ continue
+ else:
+ map_text_pos = {}
+ else:
+ need_text_idx = []
+ map_text_pos = {}
+
+ # ---- save per sample + record metadata
+ for i in range(batch_size):
+ local_idx = local_start_idx + i
+ if local_idx >= len(sampler_indices):
+ continue
+
+ global_dataset_idx = sampler_indices[local_idx]
+ sample_idx = args.resume_from_index + global_dataset_idx
+ paths = batch_paths[i]
+
+ row = dataset.data[global_dataset_idx] if global_dataset_idx < len(dataset.data) else None
+ if row is None:
+ continue
+
+ existing_info = existing_shapes.get(sample_idx, {})
+
+ video_code_shape = None
+ if args.extract_video:
+ if need_video[i]:
+ if video_codes is None:
+ process_failed_samples.append({"index": sample_idx, "reason": "video_codes_none"})
+ continue
+ pos = map_video_pos[i]
+ video_code = video_codes[pos].astype(np.int32)
+ try:
+ atomic_save_npy(paths["video"], video_code)
+ video_code_shape = list(video_code.shape)
+ except Exception as e:
+ process_failed_samples.append({"index": sample_idx, "reason": f"video_save_failed: {str(e)}"})
+ continue
+ else:
+ video_code_shape = existing_info.get("video_code_shape") or safe_mmap_shape(paths["video"])
+
+ text_embedding_shape = None
+ if args.extract_text:
+ if need_text[i]:
+ if encoder_hidden_states is None:
+ process_failed_samples.append({"index": sample_idx, "reason": "text_embeddings_none"})
+ continue
+ pos = map_text_pos[i]
+ text_emb = encoder_hidden_states[pos]
+ try:
+ atomic_save_npy(paths["text"], text_emb)
+ text_embedding_shape = list(text_emb.shape)
+ except Exception as e:
+ process_failed_samples.append({"index": sample_idx, "reason": f"text_save_failed: {str(e)}"})
+ continue
+
+ if args.save_attention_mask and need_mask[i]:
+ if attention_masks is None:
+ process_failed_samples.append({"index": sample_idx, "reason": "attention_masks_none"})
+ continue
+ try:
+ atomic_save_npy(paths["mask"], attention_masks[pos])
+ except Exception as e:
+ process_failed_samples.append({"index": sample_idx, "reason": f"mask_save_failed: {str(e)}"})
+ continue
+ else:
+ text_embedding_shape = existing_info.get("text_embedding_shape") or safe_mmap_shape(paths["text"])
+
+ sample_meta = {
+ "index": sample_idx,
+ "video_path": row.get("video", ""),
+ "caption": row.get("caption", ""),
+ }
+ if args.extract_video and video_code_shape is not None:
+ sample_meta["video_code_shape"] = video_code_shape
+ if args.extract_text and text_embedding_shape is not None:
+ sample_meta["text_embedding_shape"] = text_embedding_shape
+ if args.extract_text and context_lens_np_full is not None:
+ sample_meta["context_len"] = int(context_lens_np_full[i])
+
+ process_metadata["samples"].append(sample_meta)
+ process_samples_processed += 1
+
+ # periodic per-process metadata save (atomic, tagged)
+ if process_samples_processed > 0 and (process_samples_processed % 1000 == 0):
+ suffix = f".{args.run_tag}" if args.run_tag else ""
+ process_metadata_file = os.path.join(args.output_dir, f"metadata_process_{process_index}{suffix}.json")
+ process_metadata["num_extracted"] = process_samples_processed
+ process_metadata["failed_samples"] = process_failed_samples
+ atomic_save_json(process_metadata_file, process_metadata, indent=2)
+ logger.info(f"[GPU {process_index}] Progress: {process_samples_processed} samples recorded -> {process_metadata_file}")
+
+ accelerator.wait_for_everyone()
+
+ # final per-process metadata save
+ suffix = f".{args.run_tag}" if args.run_tag else ""
+ process_metadata_file = os.path.join(args.output_dir, f"metadata_process_{process_index}{suffix}.json")
+ process_metadata["num_attempted"] = int(process_attempted_samples)
+ process_metadata["num_extracted"] = int(process_samples_processed)
+ process_metadata["num_failed"] = int(len(process_failed_samples))
+ process_metadata["failed_samples"] = process_failed_samples
+ atomic_save_json(process_metadata_file, process_metadata, indent=2)
+
+ logger.info(f"[GPU {process_index}] Done: attempted={process_attempted_samples}, extracted(meta)={process_samples_processed}, failed={len(process_failed_samples)}")
+ accelerator.wait_for_everyone()
+
+ # merge on main process
+ if accelerator.is_main_process:
+ merge_world = args.merge_world_size if args.merge_world_size is not None else world_size
+ logger.info(f"[MERGE] merging world_size={merge_world} (run_tag={args.run_tag})")
+ merge_metadata(args.output_dir, merge_world_size=merge_world, run_tag=args.run_tag)
+
+
+if __name__ == "__main__":
+ main()
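
Note on the `--index_file` path above: the mapping from global sample indices to dataset positions can be sketched in isolation as follows. This is a minimal illustration rather than the script's own code; the helper names `parse_index_lines` and `to_local_indices` are hypothetical, and the only behaviour assumed is what the sampler setup shows (skip blank and `#` lines, subtract `resume_from_index`, drop out-of-range entries).

```python
from typing import Iterable, List


def parse_index_lines(lines: Iterable[str]) -> List[int]:
    """Parse one integer per line, skipping blank lines and '#' comment lines."""
    indices = []
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            indices.append(int(stripped))
    return indices


def to_local_indices(global_indices: Iterable[int], resume_from_index: int, dataset_len: int) -> List[int]:
    """Shift global sample indices by the resume offset and drop out-of-range ones."""
    local = []
    for global_idx in global_indices:
        pos = global_idx - resume_from_index
        if 0 <= pos < dataset_len:
            local.append(pos)
    return local


# Hypothetical input: three requested samples, a resume offset of 10, 50 rows left.
wanted = parse_index_lines(["# missing samples", "12", "", "7", "105"])
local = to_local_indices(wanted, resume_from_index=10, dataset_len=50)
print(wanted, local)
```

Indices that fall outside the sliced dataset are dropped silently here, as in the script, so an index file written against a different `--resume_from_index` can yield fewer samples than it lists.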
diff --git a/Meissonic/train/fix_extrach.py b/Meissonic/train/fix_extrach.py
new file mode 100644
index 0000000000000000000000000000000000000000..6138d05476d2fb9fac1ef88ebd4c25d9c75e01f2
--- /dev/null
+++ b/Meissonic/train/fix_extrach.py
@@ -0,0 +1,122 @@
+#!/usr/bin/env python3
+import os, json, glob, csv
+from pathlib import Path
+
+OUTDIR = "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set"
+MERGED = f"{OUTDIR}/metadata.json"
+CSV_PATH = "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv"
+WORLD = 8
+
+def safe_json_load(p):
+ try:
+ with open(p, "r") as f:
+ return json.load(f)
+ except Exception:
+ return None
+
+def count_csv_rows(csv_path):
+ # robust row count excluding header
+ n = 0
+ with open(csv_path, "r", newline="") as f:
+ reader = csv.reader(f)
+ header = next(reader, None)
+ for _ in reader:
+ n += 1
+ return n
+
+def pick_header_source():
+ # prefer resume files
+ cand = []
+ for r in range(WORLD):
+ p = f"{OUTDIR}/metadata_process_{r}.resume.json"
+ if os.path.exists(p):
+ cand.append(p)
+ if cand:
+ return cand[0]
+ # fallback to old
+ for r in range(WORLD):
+ p = f"{OUTDIR}/metadata_process_{r}.json"
+ if os.path.exists(p):
+ # Old files may be truncated; json.load then fails even though the header sits at the top,
+ # so collect every rank and return the first file that loads cleanly.
+ cand.append(p)
+ for p in cand:
+ m = safe_json_load(p)
+ if m is not None:
+ return p
+ return None
+
+def main():
+ merged = safe_json_load(MERGED)
+ assert merged is not None, f"Cannot load {MERGED}"
+
+ samples = merged.get("samples", [])
+ n_list = len(samples)
+ idxs = [s.get("index") for s in samples if isinstance(s, dict)]
+ idxs_int = [int(x) for x in idxs if x is not None]
+ n_unique = len(set(idxs_int))
+ n_dup = n_list - n_unique
+
+ print("---- Current merged metadata.json ----")
+ print(f"samples list length = {n_list:,}")
+ print(f"unique indices = {n_unique:,}")
+ print(f"duplicates in samples list = {n_dup:,}")
+
+ n_csv = count_csv_rows(CSV_PATH)
+ print("---- CSV ----")
+ print(f"CSV rows (expected total) = {n_csv:,}")
+
+ header_src = pick_header_source()
+ header = {}
+ if header_src is not None:
+ src = safe_json_load(header_src)
+ if src is not None:
+ header = src
+ print(f"---- Header source ----\n{header_src}")
+ else:
+ print("---- Header source ----\nNone (json.load failed), will use minimal header")
+ else:
+ print("---- Header source ----\nNone found, will use minimal header")
+
+ # build new header like your 128-version format
+ new_meta = {
+ "num_samples_original": int(header.get("num_samples_original", n_csv)),
+ "resume_from_index": int(header.get("resume_from_index", 0)),
+ "num_samples_this_run": int(header.get("num_samples_this_run", n_csv)),
+ "num_attempted": int(header.get("num_attempted", n_unique)), # fallback
+ "num_extracted": int(header.get("num_extracted", n_unique)), # fallback
+ "num_failed": int(header.get("num_failed", 0)),
+ "num_processes": int(header.get("num_processes", WORLD)),
+ "ranks_seen": header.get("ranks_seen", list(range(WORLD))),
+ "world_size_used": int(header.get("world_size_used", WORLD)),
+ "extract_video": bool(header.get("extract_video", True)),
+ "extract_text": bool(header.get("extract_text", True)),
+ "text_encoder_architecture": header.get("text_encoder_architecture", "umt5-xxl"),
+ "video_tokenizer_model_id": header.get("video_tokenizer_model_id", "Cosmos-0.1-Tokenizer-DV4x8x8"),
+ "codebook_size": header.get("codebook_size", 64000),
+ "mask_token_id": header.get("mask_token_id", 64000),
+ "num_frames": int(header.get("num_frames", 17)),
+ "video_height": int(header.get("video_height", 256)),
+ "video_width": int(header.get("video_width", 256)),
+ "prompt_prefix": header.get("prompt_prefix", None),
+ "text_dtype": header.get("text_dtype", "bf16"),
+ "save_attention_mask": bool(header.get("save_attention_mask", True)),
+ "empty_embeds_shape": header.get("empty_embeds_shape", [1, 512, 4096]),
+ "empty_embeds_path": header.get("empty_embeds_path", "empty_embeds.npy"),
+ "samples": samples,
+ }
+
+ # backup & write
+ bak = f"{MERGED}.bak"
+ if os.path.exists(MERGED) and not os.path.exists(bak):
+ os.replace(MERGED, bak)
+ print(f"Backup: {MERGED} -> {bak}")
+
+ tmp = f"{MERGED}.tmp"
+ with open(tmp, "w") as f:
+ json.dump(new_meta, f, indent=2)
+ os.replace(tmp, MERGED)
+ print(f"Wrote: {MERGED}")
+
+if __name__ == "__main__":
+ main()
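
Note on the write step above: the script writes to `metadata.json.tmp` and then calls `os.replace`, which is the usual write-then-rename pattern. A slightly more defensive variant is sketched below under stated assumptions: `atomic_write_json` is a hypothetical helper (not the repository's `atomic_save_json`, whose body is not shown in this diff), and the temporary file is created in the target directory so the rename stays on one filesystem.

```python
import json
import os
import tempfile


def atomic_write_json(path, obj, indent=2):
    """Write obj as JSON to path via a temp file in the same directory plus os.replace."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(obj, f, indent=indent)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before the rename
        os.replace(tmp_path, path)  # readers see either the old or the new file
    except BaseException:
        # Clean up the partial temp file and re-raise the original error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Usage with a throwaway directory; the payload is made up for illustration.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, "metadata.json")
    atomic_write_json(demo_path, {"num_extracted": 2, "samples": [{"index": 0}, {"index": 1}]})
    with open(demo_path, "r") as f:
        reloaded = json.load(f)
    leftovers = [name for name in os.listdir(demo_dir) if name.endswith(".tmp")]
```

A unique temp name also avoids two concurrent runs clobbering the same `.tmp` file, which a fixed `metadata.json.tmp` name does not guard against.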
diff --git a/Meissonic/train/infer_mei_video.py b/Meissonic/train/infer_mei_video.py
new file mode 100644
index 0000000000000000000000000000000000000000..b3ddc0baf12aeeae00c73c7c855dec6b6a063d00
--- /dev/null
+++ b/Meissonic/train/infer_mei_video.py
@@ -0,0 +1,497 @@
+#!/usr/bin/env python3
+"""
+Batch video generation inference script using the trained video diffusion model.
+
+This script supports processing multiple prompts in batch, with each prompt
+generating videos for all specified guidance scales.
+
+Usage:
+ python train/infer_mei_video.py --pretrained_model_name_or_path /path/to/trained/model --prompts "A person walking" --output_dir ./output
+
+Example:
+ python train/infer_mei_video.py \
+ --pretrained_model_name_or_path ./output/checkpoint-1000 \
+ --guidance_scales 1.0 3.0 5.0 7.0 9.0 11.0 \
+ --num_inference_steps 1000 \
+ --prompts "A person walking" "mountain" \
+ --output_dir ./batch_output
+"""
+
+import os
+import sys
+sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+
+# Set memory optimization environment variables
+# os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True,max_split_size_mb:512"
+# os.environ["TORCH_USE_CUDA_DSA"] = "1"
+
+import argparse
+import torch
+from torchvision import transforms
+from torchvision.utils import save_image, make_grid
+from PIL import Image
+import numpy as np
+import time
+import logging
+
+from src.transformer_video import WanDiscreteVideoTransformer
+from src.pipeline_video import Pipeline, CosmosVideoTokenizer
+from src.scheduler_video import Scheduler
+from transformers import T5Tokenizer, T5EncoderModel
+
+logging.basicConfig(
+ format="%(asctime)s - %(levelname)s - %(message)s",
+ datefmt="%m/%d/%Y %H:%M:%S",
+ level=logging.INFO,
+)
+logger = logging.getLogger(__name__)
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(description="Batch video generation inference")
+ parser.add_argument(
+ "--pretrained_model_name_or_path",
+ type=str,
+ required=True,
+ help="Path to trained model checkpoint (directory containing 'transformer' subfolder)",
+ )
+ parser.add_argument(
+ "--text_encoder_path",
+ type=str,
+ default="google/umt5-xxl",
+ help="Path or HuggingFace ID for T5 text encoder",
+ )
+ parser.add_argument(
+ "--video_tokenizer_model_id",
+ type=str,
+ default="Cosmos-0.1-Tokenizer-DV4x8x8",
+ help="HuggingFace model ID for Cosmos video tokenizer",
+ )
+ parser.add_argument(
+ "--num_frames",
+ type=int,
+ default=16,
+ help="Number of frames to generate",
+ )
+ parser.add_argument(
+ "--height",
+ type=int,
+ default=128,
+ help="Height of generated video",
+ )
+ parser.add_argument(
+ "--width",
+ type=int,
+ default=128,
+ help="Width of generated video",
+ )
+ parser.add_argument(
+ "--num_inference_steps",
+ type=int,
+ default=64,
+ help="Number of denoising steps",
+ )
+ parser.add_argument(
+ "--guidance_scale",
+ type=float,
+ default=9.0,
+ help="Classifier-free guidance scale (single value, ignored if --guidance_scales is provided)",
+ )
+ parser.add_argument(
+ "--guidance_scales",
+ type=float,
+ nargs="+",
+ default=None,
+ help="List of guidance scales to test (e.g., --guidance_scales 3.0 5.0 7.0 9.0). If provided, overrides --guidance_scale",
+ )
+ parser.add_argument(
+ "--output_dir",
+ type=str,
+ default="./generated_videos",
+ help="Directory to save generated videos",
+ )
+ parser.add_argument(
+ "--prompts",
+ type=str,
+ nargs="+",
+ required=True,
+ help="List of prompts for video generation",
+ )
+ parser.add_argument(
+ "--negative_prompt",
+ type=str,
+ default="worst quality, low quality, blurry, distortion, watermark, logo, text, jpeg artifacts",
+ help="Negative prompt for generation",
+ )
+ parser.add_argument(
+ "--seed",
+ type=int,
+ default=42,
+ help="Random seed for reproducibility",
+ )
+ parser.add_argument(
+ "--device",
+ type=str,
+ default="cuda",
+ help="Device to run inference on",
+ )
+ parser.add_argument(
+ "--dtype",
+ type=str,
+ default="bfloat16",
+ choices=["float32", "float16", "bfloat16"],
+ help="Data type for inference (bfloat16 recommended for memory efficiency)",
+ )
+ parser.add_argument(
+ "--save_frames",
+ action="store_true",
+ help="Save individual frames as images",
+ )
+ parser.add_argument(
+ "--save_grid",
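+ action=argparse.BooleanOptionalAction, # requires Python 3.9+; store_true with default=True could never be turned off
+ default=True,
+ help="Save frames as a grid image (enabled by default; pass --no-save_grid to disable)",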
+ )
+ parser.add_argument(
+ "--save_mp4",
+ action="store_true",
+ help="Save video as MP4 file (requires imageio-ffmpeg)",
+ )
+ parser.add_argument(
+ "--fps",
+ type=int,
+ default=8,
+ help="Frames per second for video output (GIF/MP4)",
+ )
+ return parser.parse_args()
+
+
+def save_video_as_gif(frames, output_path, duration=100):
+ """Save video frames as GIF"""
+ if isinstance(frames[0], torch.Tensor):
+ frames = [transforms.ToPILImage()(f.clamp(0, 1)) for f in frames]
+ frames[0].save(
+ output_path,
+ save_all=True,
+ append_images=frames[1:],
+ duration=duration,
+ loop=0
+ )
+
+
+def save_video_frames(frames, output_dir, prefix="frame"):
+ """Save individual video frames as images"""
+ os.makedirs(output_dir, exist_ok=True)
+ for i, frame in enumerate(frames):
+ if isinstance(frame, torch.Tensor):
+ frame = transforms.ToPILImage()(frame.clamp(0, 1))
+ frame.save(os.path.join(output_dir, f"{prefix}_{i:04d}.png"))
+
+
+def save_video_as_mp4(frames, output_path, fps=8):
+ """Save video frames as MP4 file"""
+ try:
+ import imageio
+ import imageio_ffmpeg
+ except ImportError:
+ raise ImportError(
+ "imageio and imageio-ffmpeg are required for MP4 output. "
+ "Install with: pip install imageio imageio-ffmpeg"
+ )
+
+ # Convert frames to numpy arrays
+ if isinstance(frames[0], Image.Image):
+ frame_arrays = [np.array(frame) for frame in frames]
+ elif isinstance(frames[0], torch.Tensor):
+ frame_arrays = [
+ (frame.clamp(0, 1).permute(1, 2, 0).float().cpu().numpy() * 255).astype(np.uint8) # .float(): NumPy has no bfloat16
+ for frame in frames
+ ]
+ else:
+ frame_arrays = [np.array(frame) for frame in frames]
+
+ # Write MP4
+ imageio.mimwrite(output_path, frame_arrays, fps=fps, codec='libx264', quality=8)
+
+
+def main():
+ args = parse_args()
+
+ # Determine guidance scales to use
+ if args.guidance_scales is not None:
+ guidance_scales = args.guidance_scales
+ else:
+ guidance_scales = [args.guidance_scale]
+
+ # Determine if we're doing grid search
+ is_grid_search = len(guidance_scales) > 1
+ total_combinations = len(guidance_scales) * len(args.prompts)
+
+ print("=" * 80)
+ print("Batch Video Generation Inference")
+ print("=" * 80)
+ print(f"Number of prompts: {len(args.prompts)}")
+ print(f"Guidance scales: {guidance_scales}")
+ print(f"Total combinations: {total_combinations}")
+ print(f"Output directory: {args.output_dir}")
+ print(f"Model path: {args.pretrained_model_name_or_path}")
+
+ # Set device and dtype
+ device = torch.device(args.device if torch.cuda.is_available() else "cpu")
+ if args.dtype == "float16":
+ dtype = torch.float16
+ elif args.dtype == "bfloat16":
+ dtype = torch.bfloat16
+ else:
+ dtype = torch.float32
+
+ # Set random seed
+ if args.seed is not None:
+ torch.manual_seed(args.seed)
+ if torch.cuda.is_available():
+ torch.cuda.manual_seed_all(args.seed)
+ print(f"Random seed: {args.seed}")
+
+ # Create output directory
+ os.makedirs(args.output_dir, exist_ok=True)
+
+ print("=" * 80)
+ print("Loading Video Generation Pipeline")
+ print("=" * 80)
+
+ # 1. Load video tokenizer
+ print(f"\nLoading video tokenizer: {args.video_tokenizer_model_id}")
+ video_tokenizer = CosmosVideoTokenizer(
+ model_id=args.video_tokenizer_model_id,
+ device=device,
+ dtype=dtype
+ )
+ print("✓ Video tokenizer loaded")
+ print(f" Codebook size: {video_tokenizer.codebook_size}")
+ print(f" Compression factors: {video_tokenizer.t_downsample}x{video_tokenizer.h_downsample}x{video_tokenizer.w_downsample}")
+
+ # 2. Load T5 text encoder and tokenizer
+ print(f"\nLoading text encoder: {args.text_encoder_path}")
+ tokenizer = T5Tokenizer.from_pretrained(args.text_encoder_path)
+ # Load text encoder to specified device
+ text_encoder = T5EncoderModel.from_pretrained(
+ args.text_encoder_path,
+ torch_dtype=dtype,
+ ).to(device)
+ text_encoder.eval()
+ print(f"✓ Text encoder loaded (dim: {text_encoder.config.d_model})")
+
+ # 3. Calculate compressed dimensions
+ F_prime = args.num_frames // video_tokenizer.t_downsample
+ H_prime = args.height // video_tokenizer.h_downsample
+ W_prime = args.width // video_tokenizer.w_downsample
+ print(f"\nCompressed dimensions: F'={F_prime}, H'={H_prime}, W'={W_prime}")
+
+ # 4. Load transformer model
+ print(f"\nLoading transformer from: {args.pretrained_model_name_or_path}")
+ transformer_path = args.pretrained_model_name_or_path
+ if os.path.isdir(args.pretrained_model_name_or_path):
+ # Check for transformer subfolder
+ if os.path.exists(os.path.join(args.pretrained_model_name_or_path, "transformer")):
+ transformer_path = os.path.join(args.pretrained_model_name_or_path, "transformer")
+
+ try:
+ transformer = WanDiscreteVideoTransformer.from_pretrained(
+ transformer_path,
+ torch_dtype=dtype,
+ low_cpu_mem_usage=True,
+ ).to(device)
+ print("✓ Transformer loaded from pretrained")
+ except Exception as e:
+ print(f"Failed to load pretrained transformer: {e}")
+ print("Trying to load config and state dict separately...")
+
+ # Try loading from state dict
+ import json
+ config_path = os.path.join(transformer_path, "config.json")
+ if os.path.exists(config_path):
+ with open(config_path, 'r') as f:
+ config = json.load(f)
+
+ transformer = WanDiscreteVideoTransformer(**config)
+
+ # Load state dict
+ state_dict_path = os.path.join(transformer_path, "diffusion_pytorch_model.safetensors")
+ if not os.path.exists(state_dict_path):
+ state_dict_path = os.path.join(transformer_path, "diffusion_pytorch_model.bin")
+ if not os.path.exists(state_dict_path):
+ state_dict_path = os.path.join(transformer_path, "pytorch_model.bin")
+
+ if os.path.exists(state_dict_path):
+ if state_dict_path.endswith(".safetensors"):
+ from safetensors import safe_open
+ state_dict = {}
+ with safe_open(state_dict_path, framework="pt", device="cpu") as f:
+ for k in f.keys():
+ state_dict[k] = f.get_tensor(k)
+ else:
+ state_dict = torch.load(state_dict_path, map_location="cpu")
+ transformer.load_state_dict(state_dict, strict=False)
+ print(f"✓ Loaded state dict from {state_dict_path}")
+
+ transformer = transformer.to(device, dtype=dtype)
+ else:
+ raise FileNotFoundError(f"Could not find config.json in {transformer_path}")
+
+ transformer.eval()
+
+ # 5. Initialize scheduler
+ print("Initializing scheduler...")
+ scheduler = Scheduler(
+ mask_token_id=video_tokenizer.mask_token_id,
+ masking_schedule="cosine"
+ )
+ print(f"✓ Scheduler initialized (mask_token_id: {scheduler.config.mask_token_id})")
+
+ # 6. Create pipeline
+ print("Creating video pipeline...")
+ pipe = Pipeline(
+ tokenizer=tokenizer,
+ text_encoder=text_encoder,
+ transformer=transformer,
+ scheduler=scheduler,
+ video_tokenizer=video_tokenizer,
+ text_len=512,
+ num_frames=args.num_frames,
+ height=args.height,
+ width=args.width,
+ )
+ print(f"✓ Pipeline created")
+ print(f" Default video dimensions: {pipe.num_frames} frames, {pipe.height}x{pipe.width}")
+
+ # 7. Generate videos for each prompt
+ print("\n" + "=" * 80)
+ print("Generating Videos")
+ print("=" * 80)
+
+ total_start_time = time.time()
+ successful_generations = 0
+ failed_generations = 0
+ combination_idx = 0
+
+ for prompt_idx, prompt in enumerate(args.prompts):
+ for guidance_scale in guidance_scales:
+ combination_idx += 1
+
+ # Create output directory for this combination
+ if is_grid_search:
+ sanitized_prompt = prompt.replace(" ", "_")[:30].replace("/", "-").replace("\\", "-")
+ combo_dir = os.path.join(
+ args.output_dir,
+ f"prompt{prompt_idx:02d}_{sanitized_prompt}",
+ f"gs{guidance_scale:.1f}_steps{args.num_inference_steps}"
+ )
+ os.makedirs(combo_dir, exist_ok=True)
+ else:
+ combo_dir = args.output_dir
+ os.makedirs(combo_dir, exist_ok=True)
+
+ total_items = total_combinations if is_grid_search else len(args.prompts)
+ prompt_display = prompt[:60] + "..." if len(prompt) > 60 else prompt
+ print(f"\n[{combination_idx}/{total_items}] "
+ f"Prompt {prompt_idx+1}/{len(args.prompts)} | "
+ f"Guidance: {guidance_scale:.1f} | Steps: {args.num_inference_steps}")
+ print(f" Prompt: '{prompt_display}'")
+
+ try:
+ generation_start = time.time()
+
+ with torch.no_grad():
+ result = pipe(
+ prompt=prompt,
+ negative_prompt=args.negative_prompt,
+ num_frames=args.num_frames,
+ height=args.height,
+ width=args.width,
+ guidance_scale=guidance_scale,
+ num_inference_steps=args.num_inference_steps,
+ output_type="pil",
+ return_dict=True,
+ )
+
+ videos = result.videos
+
+ # Handle different output formats
+ if isinstance(videos, list) and len(videos) > 0:
+ video_frames = videos[0] if isinstance(videos[0], list) else videos
+ else:
+ video_frames = videos
+
+ generation_time = time.time() - generation_start
+ print(f" ✓ Generated in {generation_time:.2f}s")
+
+ # Create sanitized filename from prompt
+ sanitized_prompt = prompt.replace(" ", "_")[:50].replace("/", "-").replace("\\", "-")
+ if is_grid_search:
+ base_name = f"gs{guidance_scale:.1f}_steps{args.num_inference_steps}"
+ else:
+ base_name = f"{prompt_idx:02d}_{sanitized_prompt}"
+
+ # Save video as GIF
+ gif_path = os.path.join(combo_dir, f"{base_name}.gif")
+ save_video_as_gif(video_frames, gif_path, duration=int(1000 / args.fps))
+ print(f" ✓ Saved GIF: {gif_path}")
+
+ # Save as MP4 if requested
+ if args.save_mp4:
+ mp4_path = os.path.join(combo_dir, f"{base_name}.mp4")
+ save_video_as_mp4(video_frames, mp4_path, fps=args.fps)
+ print(f" ✓ Saved MP4: {mp4_path}")
+
+ # Save as frame grid
+ if args.save_grid:
+ if isinstance(video_frames[0], Image.Image):
+ frames_tensor = torch.stack([transforms.ToTensor()(f) for f in video_frames], dim=0)
+ else:
+ frames_tensor = torch.stack(video_frames, dim=0)
+
+ grid = make_grid(frames_tensor, nrow=min(8, len(video_frames)))
+ grid_path = os.path.join(combo_dir, f"{base_name}_grid.png")
+ save_image(grid, grid_path)
+ print(f" ✓ Saved grid: {grid_path}")
+
+ # Save individual frames
+ if args.save_frames:
+ frames_dir = os.path.join(combo_dir, f"{base_name}_frames")
+ save_video_frames(video_frames, frames_dir)
+ print(f" ✓ Saved frames: {frames_dir}")
+
+ successful_generations += 1
+
+ except Exception as e:
+ failed_generations += 1
+ print(f" ✗ Failed to generate video: {e}")
+ import traceback
+ traceback.print_exc()
+
+ total_time = time.time() - total_start_time
+
+ print("\n" + "=" * 80)
+ print("Generation Summary")
+ print("=" * 80)
+ if is_grid_search:
+ print(f"Grid search over {len(guidance_scales)} guidance scales × {len(args.prompts)} prompts")
+ print(f"Total combinations: {combination_idx}")
+ print(f"Successful: {successful_generations}")
+ print(f"Failed: {failed_generations}")
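+ print(f"Total time: {total_time:.2f}s")
+ if successful_generations > 0:
+ print(f"Average time per successful video: {total_time / successful_generations:.2f}s")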
+ print(f"Videos saved to: {args.output_dir}")
+ if is_grid_search:
+ print("\nDirectory structure:")
+ print(f" {args.output_dir}/")
+ print(" prompt00_[prompt]/")
+ print(" gs[scale]_steps[steps]/")
+ print(" gs[scale]_steps[steps].gif")
+ print("=" * 80)
+
+
+if __name__ == "__main__":
+ main()
diff --git a/Meissonic/train/infer_video.sh b/Meissonic/train/infer_video.sh
new file mode 100644
index 0000000000000000000000000000000000000000..15b35bf902f7f09d6931d97b6744d4ccedd0e5bf
--- /dev/null
+++ b/Meissonic/train/infer_video.sh
@@ -0,0 +1,54 @@
+# # Grid search: sweep over different guidance scales and inference steps
+# python train/infer_mei_video.py \
+# --model_path /mnt/Meissonic/output/3500steps/43.1/transformer \
+# --guidance_scales 1.0 3.0 5.0 7.0 9.0 11.0 \
+# --inference_steps_list 32 48 64 80 100 \
+# --prompts "A cat playing with a ball" \
+# --output_dir ./grid_search_results \
+# --save_mp4
+
+# Sweep over guidance scale only
+python train/infer_mei_video.py \
+ --pretrained_model_name_or_path /mnt/Meissonic/output/43.1/checkpoint-24500/transformer \
+ --guidance_scales 5.0 7.0 9.0 11.0 13.0 15.0 \
+ --num_inference_steps 128 \
+ --prompts \
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution." \
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner." \
+ "The video is a news segment from a television show. The style of the video is a combination of a live shot and a graphic overlay. The live shot features a woman standing in front of a building with a lawn and a fence. She is wearing a black coat and a green scarf. The graphic overlay includes a blue banner with white text that reads \"NEW THIS MORNING WAR OF WORDS FROM TRUMP AND CRUZ DEMOCRATS GEAR UP FOR DEBATE TONIGHT\". The text is in all caps and is positioned at the bottom of the screen. The overall style of the video is professional and informative." \
+ --output_dir ./gs_search_24500
+
+
+# --prompts "A person walking" \
+# --output_dir ./gs_search
+
+# python train/infer_mei_video.py \
+# --pretrained_model_name_or_path /mnt/Meissonic/output/43.1/checkpoint-8500/transformer \
+# --guidance_scales 1.0 3.0 5.0 7.0 9.0 11.0 \
+# --num_inference_steps 1000 \
+# --prompts "mountain" \
+# --output_dir ./gs_search
+
+# python train/infer_mei_video.py \
+# --pretrained_model_name_or_path /mnt/Meissonic/output/43.1/checkpoint-8500/transformer \
+# --guidance_scales 1.0 3.0 5.0 7.0 9.0 11.0 \
+# --num_inference_steps 1000 \
+# --prompts "The video is a news segment from a television show. The style of the video is a combination of a live shot and a graphic overlay. The live shot features a woman standing in front of a building with a lawn and a fence. She is wearing a black coat and a green scarf. The graphic overlay includes a blue banner with white text that reads \"NEW THIS MORNING WAR OF WORDS FROM TRUMP AND CRUZ DEMOCRATS GEAR UP FOR DEBATE TONIGHT\". The text is in all caps and is positioned at the bottom of the screen. The overall style of the video is professional and informative." \
+# --output_dir ./gs_search
+
+
+# # Sweep over inference steps only
+# python train/infer_mei_video.py \
+# --model_path /mnt/Meissonic/output/3500steps/43.1/transformer \
+# --guidance_scale 7.0 \
+# --inference_steps_list 32 48 64 80 100 \
+# --prompts "A person walking" \
+# --output_dir ./steps_search
+
+# # Single parameter set (no grid search; keeps the original behavior)
+# python train/infer_mei_video.py \
+# --model_path /mnt/Meissonic/output/3000steps \
+# --guidance_scale 7.0 \
+# --num_inference_steps 64 \
+# --prompts "A car driving" \
+# --output_dir ./single_output
\ No newline at end of file
diff --git a/Meissonic/train/run_overfit.sh b/Meissonic/train/run_overfit.sh
new file mode 100644
index 0000000000000000000000000000000000000000..0091e75da8bf68d04fe04dbf96b18c41e7d84ee1
--- /dev/null
+++ b/Meissonic/train/run_overfit.sh
@@ -0,0 +1,29 @@
+#!/bin/bash
+# Overfitting experiment script
+# This script runs a small overfitting experiment to verify implementation correctness
+
+accelerate launch --multi_gpu --gpu_ids '0,1,2,3,4,5,6,7' --main_process_port 25011 --num_processes 8 train/train_overfit.py \
+ --text_encoder_architecture umt5-base \
+ --video_tokenizer_model_id "Cosmos-1.0-Tokenizer-DV8x16x16" \
+ --instance_data_dir "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv" \
+ --max_samples 256 \
+ --num_frames 8 \
+ --video_height 64 \
+ --video_width 112 \
+ --dataloader_num_workers 8 \
+ --train_batch_size 1 \
+ --gradient_accumulation_steps 1 \
+ --learning_rate 5e-4 \
+ --max_train_steps 3000 \
+ --lr_warmup_steps 100 \
+ --gradient_checkpointing \
+ --mixed_precision bf16 \
+ --seed 42 \
+ --output_dir "./output_overfit" \
+ --logging_steps 50 \
+ --save_steps 500 \
+ --inference_steps 500 \
+ --num_inference_samples 4 \
+ --num_inference_steps 48 \
+ --wan_pretrained_path Wan-AI/Wan2.1-T2V-1.3B # Optional: path to pretrained Wan weights
+
diff --git a/Meissonic/train/test_cosmos_vqvae.py b/Meissonic/train/test_cosmos_vqvae.py
new file mode 100644
index 0000000000000000000000000000000000000000..30a15da3870c15fa471dc39078c7e5f55cb4ab53
--- /dev/null
+++ b/Meissonic/train/test_cosmos_vqvae.py
@@ -0,0 +1,500 @@
+#!/usr/bin/env python3
+"""
+Test script for Cosmos VQ-VAE performance.
+
+This script:
+1. Loads a video from the training dataset
+2. Encodes it using CosmosVideoTokenizer
+3. Decodes it back
+4. Computes metrics (PSNR, SSIM, MSE)
+5. Creates a side-by-side comparison video
+6. Saves the results
+"""
+
+import argparse
+import os
+import sys
+sys.path.append(os.getcwd())
+
+import torch
+import numpy as np
+from PIL import Image
+import cv2
+from torchvision import transforms
+from torchvision.utils import make_grid, save_image
+
+from src.pipeline_video import CosmosVideoTokenizer
+from train.dataset_utils import OpenVid1MDataset, TinyOpenVid1MDataset
+from transformers import T5Tokenizer
+
+
+def calculate_psnr(img1, img2, max_val=1.0):
+ """Calculate PSNR between two images."""
+ mse = torch.mean((img1 - img2) ** 2)
+ if mse == 0:
+ return float('inf')
+ psnr = 20 * torch.log10(max_val / torch.sqrt(mse))
+ return psnr.item()
+
+
+def calculate_mse(img1, img2):
+ """Calculate MSE between two images."""
+ return torch.mean((img1 - img2) ** 2).item()
+
+
+def calculate_ssim(img1, img2, window_size=11):
+ """Calculate a simplified SSIM from global image statistics (no sliding window; window_size is unused)."""
+ # Stabilizing constants, assuming pixel values in [0, 1]
+ C1 = 0.01 ** 2
+ C2 = 0.03 ** 2
+
+ mu1 = img1.mean()
+ mu2 = img2.mean()
+
+ sigma1_sq = img1.var(unbiased=False)
+ sigma2_sq = img2.var(unbiased=False)
+ sigma12 = ((img1 - mu1) * (img2 - mu2)).mean()
+
+ ssim = ((2 * mu1 * mu2 + C1) * (2 * sigma12 + C2)) / ((mu1**2 + mu2**2 + C1) * (sigma1_sq + sigma2_sq + C2))
+ return ssim.item()
+
+
+def video_to_numpy(video_tensor):
+ """
+ Convert video tensor [C, F, H, W] in [0, 1] to numpy array [F, H, W, C] in [0, 255] (RGB).
+ """
+ if isinstance(video_tensor, torch.Tensor):
+ # [C, F, H, W] -> [F, C, H, W] -> [F, H, W, C]
+ # First move frame dimension to front, then transpose channels to last
+ video_np = video_tensor.permute(1, 0, 2, 3).cpu().numpy() # [F, C, H, W]
+ video_np = np.transpose(video_np, (0, 2, 3, 1)) # [F, H, W, C]
+ # Clamp to [0, 1] and convert to [0, 255]
+ video_np = np.clip(video_np, 0, 1)
+ video_np = (video_np * 255).astype(np.uint8)
+ else:
+ video_np = np.array(video_tensor)
+ return video_np
+
+
+def create_side_by_side_video(original, reconstructed, output_path, fps=8):
+ """
+ Create a side-by-side comparison video.
+
+ Args:
+ original: Original video tensor [C, F, H, W] or numpy array
+ reconstructed: Reconstructed video tensor [C, F, H, W] or numpy array
+ output_path: Path to save the output video
+ fps: Frames per second
+ """
+ # Convert to numpy (RGB format: [F, H, W, C])
+ orig_np = video_to_numpy(original)
+ recon_np = video_to_numpy(reconstructed)
+
+ # Get dimensions
+ F, H, W, C = orig_np.shape
+ F_recon, H_recon, W_recon, C_recon = recon_np.shape
+
+ # Ensure same number of frames
+ F_min = min(F, F_recon)
+ orig_np = orig_np[:F_min]
+ recon_np = recon_np[:F_min]
+
+ # Resize if dimensions don't match
+ if H != H_recon or W != W_recon:
+ print(f"Resizing reconstructed video from ({H_recon}, {W_recon}) to ({H}, {W})")
+ recon_np_resized = np.zeros((F_min, H, W, C), dtype=np.uint8)
+ for f in range(F_min):
+ # cv2.resize expects (width, height) for size parameter
+ recon_np_resized[f] = cv2.resize(recon_np[f], (W, H), interpolation=cv2.INTER_LINEAR)
+ recon_np = recon_np_resized
+
+ # Add text labels to frames
+ from PIL import Image, ImageDraw, ImageFont
+ side_by_side_frames = []
+ for f in range(F_min):
+ # Original frame with label
+ orig_frame_pil = Image.fromarray(orig_np[f])
+ draw = ImageDraw.Draw(orig_frame_pil)
+ try:
+ font = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf", 32)
+ except Exception:
+ try:
+ font = ImageFont.truetype("/System/Library/Fonts/Helvetica.ttc", 32)
+ except Exception:
+ font = ImageFont.load_default()
+ # Draw text with outline for visibility
+ text = "Original"
+ x, y = 20, 20
+ for adj in [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]:
+ draw.text((x + adj[0], y + adj[1]), text, font=font, fill=(0, 0, 0))
+ draw.text((x, y), text, font=font, fill=(255, 255, 255))
+ orig_frame = np.array(orig_frame_pil)
+
+ # Reconstructed frame with label
+ recon_frame_pil = Image.fromarray(recon_np[f])
+ draw = ImageDraw.Draw(recon_frame_pil)
+ text = "Reconstructed"
+ x, y = 20, 20
+ for adj in [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]:
+ draw.text((x + adj[0], y + adj[1]), text, font=font, fill=(0, 0, 0))
+ draw.text((x, y), text, font=font, fill=(255, 255, 0)) # Yellow text
+ recon_frame = np.array(recon_frame_pil)
+
+ # Concatenate horizontally
+ frame = np.concatenate([orig_frame, recon_frame], axis=1)
+ side_by_side_frames.append(frame)
+
+ # Write video using OpenCV (needs BGR format)
+ fourcc = cv2.VideoWriter_fourcc(*'mp4v')
+ out = cv2.VideoWriter(output_path, fourcc, fps, (W * 2, H))
+
+ if not out.isOpened():
+ print(f"Warning: Could not open video writer with mp4v codec, trying XVID...")
+ fourcc = cv2.VideoWriter_fourcc(*'XVID')
+ out = cv2.VideoWriter(output_path, fourcc, fps, (W * 2, H))
+
+ for frame in side_by_side_frames:
+ # Convert RGB to BGR for OpenCV
+ frame_bgr = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
+ out.write(frame_bgr)
+
+ out.release()
+ print(f"Saved side-by-side video to: {output_path}")
+
+
+def add_text_to_image(image_tensor, text, position=(10, 30)):
+ """
+ Add text label to an image tensor.
+
+ Args:
+ image_tensor: Image tensor [C, H, W] in [0, 1]
+ text: Text to add
+ position: (x, y) position for text
+ Returns:
+ Image tensor with text [C, H, W]
+ """
+ # Convert to PIL Image
+ image_np = image_tensor.permute(1, 2, 0).cpu().numpy() # [H, W, C]
+ image_np = np.clip(image_np, 0, 1)
+ image_np = (image_np * 255).astype(np.uint8)
+ pil_image = Image.fromarray(image_np)
+
+ # Add text
+ from PIL import ImageDraw, ImageFont
+ draw = ImageDraw.Draw(pil_image)
+ try:
+ font = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf", 24)
+ except Exception:
+ try:
+ font = ImageFont.truetype("/System/Library/Fonts/Helvetica.ttc", 24)
+ except Exception:
+ font = ImageFont.load_default()
+
+ # Draw white text with black outline
+ x, y = position
+ # Draw outline
+ for adj in [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]:
+ draw.text((x + adj[0], y + adj[1]), text, font=font, fill=(0, 0, 0))
+ # Draw main text
+ draw.text((x, y), text, font=font, fill=(255, 255, 255))
+
+ # Convert back to tensor
+ image_tensor = transforms.ToTensor()(pil_image)
+ return image_tensor
+
+
+def create_comparison_grid(original, reconstructed, output_path, nrow=4):
+ """
+ Create a grid image comparing original and reconstructed frames.
+
+ Args:
+ original: Original video tensor [C, F, H, W]
+ reconstructed: Reconstructed video tensor [C, F, H, W]
+ output_path: Path to save the grid image
+ nrow: Number of frames per row
+ """
+ # Get number of frames
+ F = min(original.shape[1], reconstructed.shape[1])
+
+ # Select frames to display
+ num_frames_to_show = min(8, F)
+ frame_indices = np.linspace(0, F - 1, num_frames_to_show, dtype=int)
+
+ frames_list = []
+ for idx in frame_indices:
+ # Original frame with label
+ orig_frame = original[:, idx, :, :].clone() # [C, H, W]
+ orig_frame = add_text_to_image(orig_frame, "Original", position=(10, 10))
+ frames_list.append(orig_frame)
+
+ # Reconstructed frame with label
+ recon_frame = reconstructed[:, idx, :, :].clone() # [C, H, W]
+ recon_frame = add_text_to_image(recon_frame, "Reconstructed", position=(10, 10))
+ frames_list.append(recon_frame)
+
+ # Create grid
+ frames_tensor = torch.stack(frames_list, dim=0)
+ grid = make_grid(frames_tensor, nrow=nrow * 2, padding=2, pad_value=1.0)
+
+ save_image(grid, output_path)
+ print(f"Saved comparison grid to: {output_path}")
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(description="Test Cosmos VQ-VAE performance")
+
+ parser.add_argument(
+ "--csv_path",
+ type=str,
+ required=True,
+ help="Path to OpenVid1M CSV file"
+ )
+ parser.add_argument(
+ "--video_root_dir",
+ type=str,
+ default=None,
+ help="Root directory for videos (auto-detected if not provided)"
+ )
+ parser.add_argument(
+ "--video_index",
+ type=int,
+ default=0,
+ help="Index of video to test (default: 0)"
+ )
+ parser.add_argument(
+ "--video_tokenizer_model_id",
+ type=str,
+ default="Cosmos-1.0-Tokenizer-DV8x16x16",
+ help="Cosmos tokenizer model ID"
+ )
+ parser.add_argument(
+ "--num_frames",
+ type=int,
+ default=16,
+ help="Number of frames"
+ )
+ parser.add_argument(
+ "--height",
+ type=int,
+ default=480,
+ help="Video height"
+ )
+ parser.add_argument(
+ "--width",
+ type=int,
+ default=848,
+ help="Video width"
+ )
+ parser.add_argument(
+ "--output_dir",
+ type=str,
+ default="./cosmos_test_output",
+ help="Output directory for results"
+ )
+ parser.add_argument(
+ "--device",
+ type=str,
+ default="cuda" if torch.cuda.is_available() else "cpu",
+ help="Device to use"
+ )
+ parser.add_argument(
+ "--dtype",
+ type=str,
+ default="float32",
+ choices=["float32", "float16", "bfloat16"],
+ help="Data type"
+ )
+
+ return parser.parse_args()
+
+
+def main():
+ args = parse_args()
+
+ # Create output directory
+ os.makedirs(args.output_dir, exist_ok=True)
+
+ # Set device and dtype
+ device = torch.device(args.device)
+ if args.dtype == "float16":
+ dtype = torch.float16
+ elif args.dtype == "bfloat16":
+ dtype = torch.bfloat16
+ else:
+ dtype = torch.float32
+
+ print(f"Using device: {device}, dtype: {dtype}")
+
+ # Initialize tokenizer
+ print("Initializing CosmosVideoTokenizer...")
+ video_tokenizer = CosmosVideoTokenizer(
+ model_id=args.video_tokenizer_model_id,
+ device=device,
+ dtype=dtype
+ )
+ print(f"Codebook size: {video_tokenizer.codebook_size}")
+ print(f"Downsampling factors: t={video_tokenizer.t_downsample}, "
+ f"h={video_tokenizer.h_downsample}, w={video_tokenizer.w_downsample}")
+
+ # Load dataset
+ print(f"Loading dataset from: {args.csv_path}")
+
+ # Auto-detect video_root_dir if not provided
+ video_root_dir = args.video_root_dir
+ if video_root_dir is None:
+ csv_dir = os.path.dirname(args.csv_path)
+ if os.path.exists(os.path.join(csv_dir, 'video_reorg')):
+ video_root_dir = os.path.join(csv_dir, 'video_reorg')
+ elif os.path.exists(os.path.join(os.path.dirname(csv_dir), 'video_reorg')):
+ video_root_dir = os.path.join(os.path.dirname(csv_dir), 'video_reorg')
+ else:
+ video_root_dir = csv_dir
+ print(f"Warning: Video directory not found, using CSV directory: {video_root_dir}")
+
+ # Initialize tokenizer for dataset (needed for OpenVid1MDataset)
+ tokenizer = T5Tokenizer.from_pretrained("google/umt5-base")
+
+ # Create dataset
+ dataset = OpenVid1MDataset(
+ csv_path=args.csv_path,
+ video_root_dir=video_root_dir,
+ tokenizer=tokenizer,
+ num_frames=args.num_frames,
+ height=args.height,
+ width=args.width,
+ text_encoder_architecture="umt5-base",
+ )
+
+ print(f"Dataset size: {len(dataset)}")
+
+ # Load video
+ if args.video_index >= len(dataset):
+ print(f"Error: video_index {args.video_index} >= dataset size {len(dataset)}")
+ return
+
+ print(f"Loading video at index {args.video_index}...")
+ sample = dataset[args.video_index]
+ original_video = sample["video"] # [C, F, H, W]
+
+ # Get video info from dataset
+ row = dataset.data[args.video_index]
+ video_path = row.get('video', 'unknown')
+ caption = row.get('caption', 'no caption')
+
+ print(f"Video path: {video_path}")
+ print(f"Caption: {caption}")
+ print(f"Original video shape: {original_video.shape}")
+ print(f"Original video range: [{original_video.min():.3f}, {original_video.max():.3f}]")
+
+ # Move to device
+ original_video = original_video.to(device=device, dtype=dtype)
+
+ # Encode
+ print("\nEncoding video...")
+ with torch.no_grad():
+ codes = video_tokenizer.encode(original_video.unsqueeze(0)) # [1, F', H', W']
+
+ print(f"Encoded codes shape: {codes.shape}")
+ print(f"Codes range: [{codes.min().item()}, {codes.max().item()}]")
+ print(f"Codebook size: {video_tokenizer.codebook_size}")
+
+ # Decode
+ print("\nDecoding video...")
+ with torch.no_grad():
+ reconstructed_video = video_tokenizer.decode(codes) # [1, C, F, H, W]
+ reconstructed_video = reconstructed_video.squeeze(0) # [C, F, H, W]
+
+ print(f"Reconstructed video shape: {reconstructed_video.shape}")
+ print(f"Reconstructed video range: [{reconstructed_video.min():.3f}, {reconstructed_video.max():.3f}]")
+
+ # Ensure same number of frames for comparison
+ F_orig = original_video.shape[1]
+ F_recon = reconstructed_video.shape[1]
+ F_min = min(F_orig, F_recon)
+
+ original_video = original_video[:, :F_min, :, :]
+ reconstructed_video = reconstructed_video[:, :F_min, :, :]
+
+ # Resize if spatial dimensions don't match
+ if original_video.shape[2:] != reconstructed_video.shape[2:]:
+ print(f"Resizing reconstructed video from {reconstructed_video.shape[2:]} to {original_video.shape[2:]}")
+ # Use interpolation to resize
+ reconstructed_video_resized = torch.zeros_like(original_video)
+ for f in range(F_min):
+ frame = reconstructed_video[:, f, :, :].unsqueeze(0) # [1, C, H, W]
+ frame_resized = torch.nn.functional.interpolate(
+ frame, size=original_video.shape[2:], mode='bilinear', align_corners=False
+ )
+ reconstructed_video_resized[:, f, :, :] = frame_resized.squeeze(0)
+ reconstructed_video = reconstructed_video_resized
+
+ # Calculate metrics
+ print("\nCalculating metrics...")
+
+ # Convert to float32 for metric calculation
+ orig_f32 = original_video.to(torch.float32)
+ recon_f32 = reconstructed_video.to(torch.float32)
+
+ # Frame-wise metrics
+ psnr_values = []
+ mse_values = []
+ ssim_values = []
+
+ for f in range(F_min):
+ orig_frame = orig_f32[:, f, :, :] # [C, H, W]
+ recon_frame = recon_f32[:, f, :, :] # [C, H, W]
+
+ psnr = calculate_psnr(orig_frame, recon_frame)
+ mse = calculate_mse(orig_frame, recon_frame)
+ ssim = calculate_ssim(orig_frame, recon_frame)
+
+ psnr_values.append(psnr)
+ mse_values.append(mse)
+ ssim_values.append(ssim)
+
+ # Overall metrics
+ avg_psnr = np.mean(psnr_values)
+ avg_mse = np.mean(mse_values)
+ avg_ssim = np.mean(ssim_values)
+
+ print(f"\n=== Metrics ===")
+ print(f"PSNR: {avg_psnr:.2f} dB (per frame: {psnr_values})")
+ print(f"MSE: {avg_mse:.6f} (per frame: {mse_values})")
+ print(f"SSIM: {avg_ssim:.4f} (per frame: {ssim_values})")
+
+ # Save metrics to file
+ metrics_file = os.path.join(args.output_dir, f"metrics_video_{args.video_index}.txt")
+ with open(metrics_file, 'w') as f:
+ f.write(f"Video Index: {args.video_index}\n")
+ f.write(f"Video Path: {video_path}\n")
+ f.write(f"Caption: {caption}\n")
+ f.write(f"\n=== Metrics ===\n")
+ f.write(f"Average PSNR: {avg_psnr:.2f} dB\n")
+ f.write(f"Average MSE: {avg_mse:.6f}\n")
+ f.write(f"Average SSIM: {avg_ssim:.4f}\n")
+ f.write(f"\nPer-frame PSNR: {psnr_values}\n")
+ f.write(f"Per-frame MSE: {mse_values}\n")
+ f.write(f"Per-frame SSIM: {ssim_values}\n")
+
+ print(f"Saved metrics to: {metrics_file}")
+
+ # Create side-by-side video
+ print("\nCreating side-by-side comparison video...")
+ video_output_path = os.path.join(args.output_dir, f"comparison_video_{args.video_index}.mp4")
+ create_side_by_side_video(original_video, reconstructed_video, video_output_path, fps=8)
+
+ # Create comparison grid
+ print("Creating comparison grid...")
+ grid_output_path = os.path.join(args.output_dir, f"comparison_grid_video_{args.video_index}.png")
+ create_comparison_grid(original_video, reconstructed_video, grid_output_path, nrow=4)
+
+ print(f"\n=== Test Complete ===")
+ print(f"Results saved to: {args.output_dir}")
+ print(f" - Metrics: {metrics_file}")
+ print(f" - Side-by-side video: {video_output_path}")
+ print(f" - Comparison grid: {grid_output_path}")
+
+
+if __name__ == "__main__":
+ main()
+
diff --git a/Meissonic/train/test_cosmos_vqvae.sh b/Meissonic/train/test_cosmos_vqvae.sh
new file mode 100644
index 0000000000000000000000000000000000000000..c6bdaea7cda7b6bdaa1e4a56de74443202793331
--- /dev/null
+++ b/Meissonic/train/test_cosmos_vqvae.sh
@@ -0,0 +1,62 @@
+#!/bin/bash
+# Test script for Cosmos VQ-VAE performance
+
+
+python train/test_cosmos_vqvae.py \
+ --csv_path "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv" \
+ --video_index 0 \
+ --video_tokenizer_model_id "Cosmos-0.1-Tokenizer-DV4x8x8" \
+ --num_frames 17 \
+ --height 128 \
+ --width 128 \
+ --output_dir "./cosmos_test_output" \
+ --device cuda \
+ --dtype float32
+
+# python train/test_cosmos_vqvae.py \
+# --csv_path "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv" \
+# --video_index 0 \
+# --video_tokenizer_model_id "Cosmos-0.1-Tokenizer-DV4x8x8" \
+# --num_frames 16 \
+# --height 480 \
+# --width 848 \
+# --output_dir "./cosmos_test_output" \
+# --device cuda \
+# --dtype float32
+
+
+# python train/test_cosmos_vqvae.py \
+# --csv_path "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv" \
+# --video_index 1 \
+# --video_tokenizer_model_id "Cosmos-0.1-Tokenizer-DV4x8x8" \
+# --num_frames 16 \
+# --height 480 \
+# --width 848 \
+# --output_dir "./cosmos_test_output" \
+# --device cuda \
+# --dtype float32
+
+# python train/test_cosmos_vqvae.py \
+# --csv_path "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv" \
+# --video_index 2 \
+# --video_tokenizer_model_id "Cosmos-0.1-Tokenizer-DV4x8x8" \
+# --num_frames 16 \
+# --height 480 \
+# --width 848 \
+# --output_dir "./cosmos_test_output" \
+# --device cuda \
+# --dtype float32
+
+
+# python train/test_cosmos_vqvae.py \
+# --csv_path "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv" \
+# --video_index 3 \
+# --video_tokenizer_model_id "Cosmos-0.1-Tokenizer-DV4x8x8" \
+# --num_frames 16 \
+# --height 480 \
+# --width 848 \
+# --output_dir "./cosmos_test_output" \
+# --device cuda \
+# --dtype float32
+
+
diff --git a/Meissonic/train/train.sh b/Meissonic/train/train.sh
new file mode 100644
index 0000000000000000000000000000000000000000..6667ecd895a97862e05dba000d775a8849e54f13
--- /dev/null
+++ b/Meissonic/train/train.sh
@@ -0,0 +1,33 @@
+# Run this script with bash from the repository root
+PYTHONPATH='./' accelerate launch --multi_gpu --gpu_ids '0,1,2,3' --main_process_port 25011 --num_processes 4 train/train_meissonic.py \
+ --output_dir "../CKPT_OUTPUT_PATH" \
+ --train_batch_size 4 \
+ --gradient_accumulation_steps 2 \
+ --learning_rate 1e-4 \
+ --max_grad_norm 10 \
+ --pretrained_model_name_or_path "meissonflow/meissonic" \
+ --text_encoder_architecture 'open_clip' \
+ --pretrained_model_architecture 'Meissonic' \
+ --training_from_scratch True \
+ --instance_dataset 'DATA_TYPE' \
+ --instance_data_dir '../parquets_father_dir/' \
+ --resolution 1024 \
+ --mixed_precision bf16 \
+ --lr_scheduler constant \
+ --use_8bit_adam \
+ --dataloader_num_workers 64 \
+ --validation_prompts \
+ 'a boy' \
+ 'A serene mountain landscape with towering snow-capped peaks, a crystal-clear blue lake reflecting the mountains, dense pine forests, and a vibrant orange sunrise illuminating the sky.' \
+ 'A playful golden retriever puppy with a shiny coat, bounding through a meadow filled with colorful wildflowers, under a bright, clear blue sky.' \
+ 'A bustling city street at night, illuminated by vibrant neon signs in various colors, with busy pedestrians, street vendors, and a light rain creating reflective puddles on the pavement.' \
+ 'A majestic, medieval castle perched on a rugged cliffside, overlooking a vast, calm ocean at sunset, with the sky painted in hues of pink, orange, and purple.' \
+ 'An elegant ballerina in a white tutu, dancing gracefully on a grand stage with ornate, gold-trimmed curtains, under a spotlight that casts a soft glow.' \
+ 'A cozy, rustic log cabin nestled in a snow-covered forest, with smoke rising from the stone chimney, warm lights glowing from the windows, and a path of footprints leading to the front door.' \
+ 'A Cute Cat' \
+ 'A Snow Mountain' \
+ --max_train_steps 100000 \
+ --checkpointing_steps 1000 \
+ --validation_steps 200 \
+ --report_to 'wandb' \
+ --logging_steps 10
diff --git a/Meissonic/train/train_mei_video.py b/Meissonic/train/train_mei_video.py
new file mode 100644
index 0000000000000000000000000000000000000000..c392f50ad44cf6625ef2f6823c5730f128501a62
--- /dev/null
+++ b/Meissonic/train/train_mei_video.py
@@ -0,0 +1,1928 @@
+# Copyright 2024 The HuggingFace Team and The MeissonFlow Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+import copy
+import logging
+import math
+import os
+os.environ["WANDB_DISABLED"] = "true"
+from contextlib import nullcontext
+from pathlib import Path
+import sys
+sys.path.append(os.getcwd())
+import torch
+import torch.nn.functional as F
+from accelerate import Accelerator
+from accelerate.logging import get_logger
+from accelerate.utils import ProjectConfiguration, set_seed
+from peft import LoraConfig
+from peft.utils import get_peft_model_state_dict
+from torch.utils.data import DataLoader, default_collate
+from torchvision import transforms
+# Video training uses T5/UMT5, not CLIP
+import diffusers.optimization
+# EMA not used for video training
+from src.scheduler_video import Scheduler
+# Video training only - no image scheduler needed
+from diffusers.loaders import LoraLoaderMixin
+from diffusers.utils import is_wandb_available
+from src.pipeline_video import Pipeline as VideoPipeline
+ from torchvision.utils import save_image, make_grid
+from datasets import load_dataset
+from train.trainer_utils import save_checkpoint
+from train.dataset_utils import VideoDataset, OpenVid1MDataset, PrecomputedFeatureDataset, PrecomputedVideoOnlyDataset
+from train.dataset_utils import tokenize_prompt, encode_prompt
+# Transformer2DModel removed - video training only uses WanDiscreteVideoTransformer
+from src.transformer_video import WanDiscreteVideoTransformer, WanModel
+from src.pipeline_video import CosmosVideoTokenizer
+from transformers import T5Tokenizer, T5EncoderModel
+
+if is_wandb_available():
+ import wandb
+ wandb.login()  # uses WANDB_API_KEY or a prior `wandb login`; never commit API keys
+
+# t5_model_path = "/inspire/hdd/project/multimodal-discrete-diffusion/xinyi-253308120310/maskgit-video/43/Meissonic/google_umt5_xxl"
+# # cosmos_model_path = ""
+# wan_model_path = "/inspire/hdd/project/multimodal-discrete-diffusion/xinyi-253308120310/maskgit-video/43/Meissonic/Wan2.1-T2V-1.3B"
+
+logger = get_logger(__name__, log_level="INFO")
+
+import torch._dynamo
+torch._dynamo.config.verbose = True
+
+# Optionally suppress errors to fall back to eager execution
+torch._dynamo.config.suppress_errors = True
+
+def parse_args():
+ parser = argparse.ArgumentParser()
+ parser.add_argument(
+ "--text_encoder_architecture",
+ type=str,
+ default="umt5-xxl",
+ required=False,
+ help="The architecture of the text encoder. For video training, must be 'umt5-base', 'umt5-xxl', or 't5'",
+ )
+ parser.add_argument(
+ "--instance_dataset",
+ type=str,
+ default=None,
+ required=False,
+ help="The dataset to use for training. One of ['MSCOCO600K', 'PickaPicV2']",
+ )
+ parser.add_argument("--training_from_scratch", action="store_true")
+
+ parser.add_argument(
+ "--pretrained_model_name_or_path",
+ type=str,
+ default=None,
+ required=True,
+ help="Path to pretrained model or model identifier from huggingface.co/models.",
+ )
+ parser.add_argument(
+ "--revision",
+ type=str,
+ default=None,
+ required=False,
+ help="Revision of pretrained model identifier from huggingface.co/models.",
+ )
+ parser.add_argument(
+ "--variant",
+ type=str,
+ default=None,
+ help="Variant of the model files of the pretrained model identifier from huggingface.co/models, e.g. 'fp16'",
+ )
+ parser.add_argument(
+ "--instance_data_dataset",
+ type=str,
+ default=None,
+ required=False,
+ help="A Hugging Face dataset containing the training images",
+ )
+ parser.add_argument(
+ "--instance_data_dir",
+ type=str,
+ default=None,
+ required=False,
+ help="A folder containing the training data of instance images.",
+ )
+ parser.add_argument(
+ "--instance_data_image", type=str, default=None, required=False, help="A single training image"
+ )
+ parser.add_argument(
+ "--use_8bit_adam", action="store_true", help="Whether or not to use 8-bit Adam from bitsandbytes."
+ )
+ parser.add_argument(
+ "--dataloader_num_workers",
+ type=int,
+ default=4,
+ help=(
+ "Number of subprocesses to use for data loading. 0 means that the data will be loaded in the main process. "
+ "Recommended: 4-8 for video loading. Set to 0 if you encounter issues with multiprocessing."
+ ),
+ )
+ parser.add_argument(
+ "--dataloader_prefetch_factor",
+ type=int,
+ default=2,
+ help=(
+ "Number of batches loaded in advance by each worker. Higher values can improve GPU utilization "
+ "but use more memory. Default: 2."
+ ),
+ )
+ parser.add_argument(
+ "--allow_tf32",
+ action="store_true",
+ help=(
+ "Whether or not to allow TF32 on Ampere GPUs. Can be used to speed up training. For more information, see"
+ " https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices"
+ ),
+ )
+ parser.add_argument("--use_ema", action="store_true", help="Whether to use EMA model.")
+ parser.add_argument("--ema_decay", type=float, default=0.9999)
+ parser.add_argument("--ema_update_after_step", type=int, default=0)
+ parser.add_argument("--adam_beta1", type=float, default=0.9, help="The beta1 parameter for the Adam optimizer.")
+ parser.add_argument("--adam_beta2", type=float, default=0.999, help="The beta2 parameter for the Adam optimizer.")
+ parser.add_argument("--adam_weight_decay", type=float, default=1e-2, help="Weight decay to use.")
+ parser.add_argument("--adam_epsilon", type=float, default=1e-08, help="Epsilon value for the Adam optimizer")
+ parser.add_argument(
+ "--output_dir",
+ type=str,
+ default="muse_training",
+ help="The output directory where the model predictions and checkpoints will be written.",
+ )
+ parser.add_argument("--seed", type=int, default=None, help="A seed for reproducible training.")
+ parser.add_argument(
+ "--logging_dir",
+ type=str,
+ default="logs",
+ help=(
+ "[TensorBoard](https://www.tensorflow.org/tensorboard) log directory. Will default to"
+ " *output_dir/runs/**CURRENT_DATETIME_HOSTNAME***."
+ ),
+ )
+ parser.add_argument(
+ "--max_train_steps",
+ type=int,
+ default=None,
+ help="Total number of training steps to perform. If provided, overrides num_train_epochs.",
+ )
+ parser.add_argument(
+ "--checkpointing_steps",
+ type=int,
+ default=500,
+ help=(
+ "Save a checkpoint of the training state every X updates. Checkpoints can be used for resuming training via `--resume_from_checkpoint`. "
+ "In the case that the checkpoint is better than the final trained model, the checkpoint can also be used for inference. "
+ "Using a checkpoint for inference requires separate loading of the original pipeline and the individual checkpointed model components. "
+ "See https://huggingface.co/docs/diffusers/main/en/training/dreambooth#performing-inference-using-a-saved-checkpoint for step-by-step "
+ "instructions."
+ ),
+ )
+ parser.add_argument(
+ "--logging_steps",
+ type=int,
+ default=50,
+ )
+ parser.add_argument(
+ "--checkpoints_total_limit",
+ type=int,
+ default=None,
+ help=(
+ "Max number of checkpoints to store. Passed as `total_limit` to the `Accelerator` `ProjectConfiguration`."
+ " See Accelerator::save_state https://huggingface.co/docs/accelerate/package_reference/accelerator#accelerate.Accelerator.save_state"
+ " for more details"
+ ),
+ )
+ parser.add_argument(
+ "--resume_from_checkpoint",
+ type=str,
+ default=None,
+ help=(
+ "Whether training should be resumed from a previous checkpoint. Use a path saved by"
+ ' `--checkpointing_steps`, or `"latest"` to automatically select the last available checkpoint.'
+ ),
+ )
+ parser.add_argument(
+ "--train_batch_size", type=int, default=16, help="Batch size (per device) for the training dataloader."
+ )
+ parser.add_argument(
+ "--gradient_accumulation_steps",
+ type=int,
+ default=1,
+ help="Number of update steps to accumulate before performing a backward/update pass.",
+ )
+ parser.add_argument(
+ "--learning_rate",
+ type=float,
+ default=0.0003,
+ help="Initial learning rate (after the potential warmup period) to use.",
+ )
+ parser.add_argument(
+ "--scale_lr",
+ action="store_true",
+ default=False,
+ help="Scale the learning rate by the number of GPUs, gradient accumulation steps, and batch size.",
+ )
+ parser.add_argument(
+ "--lr_scheduler",
+ type=str,
+ default="constant",
+ help=(
+ 'The scheduler type to use. Choose between ["linear", "cosine", "cosine_with_restarts", "polynomial",'
+ ' "constant", "constant_with_warmup"]'
+ ),
+ )
+ parser.add_argument(
+ "--lr_warmup_steps", type=int, default=500, help="Number of steps for the warmup in the lr scheduler."
+ )
+ parser.add_argument(
+ "--validation_steps",
+ type=int,
+ default=100,
+ help=(
+ "Run validation every X steps. Validation consists of running each prompt"
+ " in `args.validation_prompts` through the pipeline"
+ " and logging the outputs."
+ ),
+ )
+ parser.add_argument(
+ "--mixed_precision",
+ type=str,
+ default=None,
+ choices=["no", "fp16", "bf16"],
+ help=(
+ "Whether to use mixed precision. Choose between fp16 and bf16 (bfloat16). Bf16 requires PyTorch >="
+ " 1.10 and an Nvidia Ampere GPU. Defaults to the value of accelerate config of the current system or the"
+ " flag passed with the `accelerate.launch` command. Use this argument to override the accelerate config."
+ ),
+ )
+ parser.add_argument(
+ "--report_to",
+ type=str,
+ default="wandb",
+ help=(
+ 'The integration to report the results and logs to. Supported platforms are `"tensorboard"`,'
+ ' `"wandb"` (default) and `"comet_ml"`. Use `"all"` to report to all integrations.'
+ ),
+ )
+ parser.add_argument("--validation_prompts", type=str, nargs="*")
+ parser.add_argument(
+ "--resolution",
+ type=int,
+ default=512,
+ help=(
+ "The resolution for input images, all the images in the train/validation dataset will be resized to this"
+ " resolution"
+ ),
+ )
+ parser.add_argument("--split_vae_encode", type=int, required=False, default=None)
+ parser.add_argument("--min_masking_rate", type=float, default=0.0)
+ parser.add_argument("--cond_dropout_prob", type=float, default=0.0)
+ parser.add_argument("--max_grad_norm", default=50.0, type=float, help="Max gradient norm.", required=False)
+ parser.add_argument("--use_lora", action="store_true", help="Fine-tune the model using LoRA")
+ parser.add_argument("--text_encoder_use_lora", action="store_true", help="Fine-tune the text encoder using LoRA")
+ parser.add_argument("--lora_r", default=16, type=int)
+ parser.add_argument("--lora_alpha", default=32, type=int)
+ parser.add_argument("--lora_target_modules", default=["to_q", "to_k", "to_v"], type=str, nargs="+")
+ parser.add_argument("--text_encoder_lora_r", default=16, type=int)
+ parser.add_argument("--text_encoder_lora_alpha", default=32, type=int)
+ parser.add_argument("--text_encoder_lora_target_modules", default=["to_q", "to_k", "to_v"], type=str, nargs="+")
+ parser.add_argument("--train_text_encoder", action="store_true")
+ parser.add_argument("--image_key", type=str, required=False)
+ parser.add_argument("--prompt_key", type=str, required=False)
+ parser.add_argument(
+ "--gradient_checkpointing",
+ action="store_true",
+ help="Whether or not to use gradient checkpointing to save memory at the expense of slower backward pass.",
+ )
+ parser.add_argument("--prompt_prefix", type=str, required=False, default=None)
+
+ # Video training specific arguments
+ # Video training only - model_type is always "video"
+ parser.add_argument(
+ "--num_frames",
+ type=int,
+ default=16,
+ help="Number of frames in the video (for video training only)",
+ )
+ parser.add_argument(
+ "--video_height",
+ type=int,
+ default=480,
+ help="Height of the video in pixels (for video training only)",
+ )
+ parser.add_argument(
+ "--video_width",
+ type=int,
+ default=848,
+ help="Width of the video in pixels (for video training only)",
+ )
+ parser.add_argument(
+ "--video_tokenizer_model_id",
+ type=str,
+ default="Cosmos-1.0-Tokenizer-DV8x16x16",
+ help="HuggingFace model ID for Cosmos video tokenizer (for video training only)",
+ )
+ parser.add_argument(
+ "--wan_pretrained_path",
+ type=str,
+ default=None,
+ help="Path or HuggingFace model ID to Wan pretrained weights. If provided, will load Wan weights into the backbone.",
+ )
+ parser.add_argument(
+ "--freeze_wan_backbone",
+ action="store_true",
+ help="Freeze Wan backbone weights (set lr=0). If set, --wan_backbone_lr_ratio is ignored.",
+ )
+ parser.add_argument(
+ "--wan_backbone_lr_ratio",
+ type=float,
+ default=0.1,
+ help="Learning rate ratio for Wan backbone relative to other parts (token_embedding, logits_head). Default: 0.1 (backbone lr = base_lr * 0.1). Ignored if --freeze_wan_backbone is set.",
+ )
+ parser.add_argument(
+ "--use_precomputed_features",
+ action="store_true",
+ help="Use pre-extracted features (video codes and text embeddings) instead of encoding on-the-fly.",
+ )
+ parser.add_argument(
+ "--use_precomputed_video_only",
+ action="store_true",
+ help="Use pre-extracted video codes only, encode text with UMT5-XXL at runtime.",
+ )
+ parser.add_argument(
+ "--features_dir",
+ type=str,
+ default=None,
+ help="Directory containing pre-extracted features (required if --use_precomputed_features is set).",
+ )
+ parser.add_argument(
+ "--empty_embeds_path",
+ type=str,
+ default=None,
+ help="Path to pre-extracted empty_embeds .pt file (required if --use_precomputed_features and --cond_dropout_prob > 0).",
+ )
+
+ args = parser.parse_args()
+
+ # Validate precomputed features arguments
+ if args.use_precomputed_features and args.use_precomputed_video_only:
+ raise ValueError("Cannot set both --use_precomputed_features and --use_precomputed_video_only")
+
+ if args.use_precomputed_features or args.use_precomputed_video_only:
+ if args.features_dir is None:
+ raise ValueError("--features_dir is required when --use_precomputed_features or --use_precomputed_video_only is set")
+ if not os.path.exists(args.features_dir):
+ raise ValueError(f"Features directory not found: {args.features_dir}")
+ # Check if empty_embeds is needed
+ if args.cond_dropout_prob > 0.0:
+ # Try to get empty_embeds_path from metadata.json if not provided
+ metadata_file = os.path.join(args.features_dir, "metadata.json")
+ if args.empty_embeds_path is None and os.path.exists(metadata_file):
+ import json
+ with open(metadata_file, 'r') as f:
+ metadata = json.load(f)
+ if metadata.get("empty_embeds_path"):
+ args.empty_embeds_path = os.path.join(args.features_dir, metadata["empty_embeds_path"])
+ # Use print instead of logger since Accelerator is not initialized yet
+ print(f"Found empty_embeds_path in metadata: {args.empty_embeds_path}")
+
+ if args.empty_embeds_path is None:
+ raise ValueError(
+ "--empty_embeds_path is required when --use_precomputed_features or --use_precomputed_video_only is set "
+ "and --cond_dropout_prob > 0.0. "
+ "Please run extract_features.py with --extract_text to generate the empty_embeds file."
+ )
+ if not os.path.exists(args.empty_embeds_path):
+ raise ValueError(f"Empty embeds file not found: {args.empty_embeds_path}")
+
+ if args.report_to == "wandb":
+ if not is_wandb_available():
+ raise ImportError("Make sure to install wandb if you want to use it for logging during training.")
+
+ num_datasources = sum(
+ [x is not None for x in [args.instance_data_dir, args.instance_data_image, args.instance_data_dataset]]
+ )
+
+ # if num_datasources != 1:
+ # raise ValueError(
+ # "provide one and only one of `--instance_data_dir`, `--instance_data_image`, or `--instance_data_dataset`"
+ # )
+
+ if args.instance_data_dir is not None:
+ if not os.path.exists(args.instance_data_dir):
+ raise ValueError(f"Does not exist: `--instance_data_dir` {args.instance_data_dir}")
+
+ if args.instance_data_image is not None:
+ if not os.path.exists(args.instance_data_image):
+ raise ValueError(f"Does not exist: `--instance_data_image` {args.instance_data_image}")
+
+ if args.instance_data_dataset is not None and (args.image_key is None or args.prompt_key is None):
+ raise ValueError("`--instance_data_dataset` requires setting `--image_key` and `--prompt_key`")
+
+ return args
+
+# _prepare_latent_image_ids removed - not used for video training
+
+def safe_unwrap_model(model, accelerator):
+ """
+ Safely unwrap model from accelerate/distributed wrapper, handling torch.compile.
+
+ The unwrapping order for a compiled + distributed model:
+ 1. DistributedDataParallel (DDP) → wraps OptimizedModule
+ 2. OptimizedModule (torch.compile) → has _orig_mod attribute
+ 3. Original model (WanDiscreteVideoTransformer)
+
+ Args:
+ model: The model (may be wrapped by accelerate and/or torch.compile)
+ accelerator: Accelerator instance
+
+ Returns:
+ The unwrapped model
+ """
+ unwrapped = model
+
+ # Step 1: Unwrap DDP/accelerate wrapper
+ try:
+ unwrapped = accelerator.unwrap_model(model)
+ except (KeyError, AttributeError):
+ pass
+
+ # Step 2: Handle DDP directly if accelerator.unwrap_model didn't work
+ if hasattr(unwrapped, 'module'):
+ unwrapped = unwrapped.module
+
+ # Step 3: Unwrap torch.compile (OptimizedModule has _orig_mod)
+ if hasattr(unwrapped, '_orig_mod'):
+ unwrapped = unwrapped._orig_mod
+
+ # Step 4: Handle nested _orig_mod (in case of multiple compile calls)
+ while hasattr(unwrapped, '_orig_mod'):
+ unwrapped = unwrapped._orig_mod
+
+ return unwrapped
+
+def main(args):
+ if args.allow_tf32:
+ torch.backends.cuda.matmul.allow_tf32 = True
+
+ logging_dir = Path(args.output_dir, args.logging_dir)
+
+ accelerator_project_config = ProjectConfiguration(project_dir=args.output_dir, logging_dir=logging_dir)
+
+ accelerator = Accelerator(
+ gradient_accumulation_steps=args.gradient_accumulation_steps,
+ mixed_precision=args.mixed_precision,
+ log_with=args.report_to,
+ project_config=accelerator_project_config,
+ )
+
+ if accelerator.is_main_process:
+ os.makedirs(args.output_dir, exist_ok=True)
+
+ # Make one log on every process with the configuration for debugging.
+ logging.basicConfig(
+ format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
+ datefmt="%m/%d/%Y %H:%M:%S",
+ level=logging.INFO,
+ )
+ logger.info(accelerator.state, main_process_only=False)
+
+ if accelerator.is_main_process:
+ accelerator.init_trackers("meissonic", config=vars(copy.deepcopy(args)))
+
+ if args.seed is not None:
+ set_seed(args.seed)
+
+ # Initialize text encoder and tokenizer for video training (T5/UMT5 only)
+ # Skip loading if using precomputed features (will load only during validation)
+ text_encoder = None
+ tokenizer = None
+
+ if args.use_precomputed_features:
+ logger.info("Using precomputed features - skipping text encoder and video tokenizer loading during training")
+ logger.info("Text encoder and video tokenizer will be loaded only during validation/inference")
+ elif args.use_precomputed_video_only:
+ logger.info("Using precomputed video codes only - will encode text with UMT5-XXL at runtime")
+ logger.info("Video tokenizer will be loaded only during validation/inference")
+ # Force text encoder architecture to umt5-xxl for this mode
+ if args.text_encoder_architecture != "umt5-xxl":
+ logger.info(f"Forcing text_encoder_architecture to 'umt5-xxl' for --use_precomputed_video_only mode (was '{args.text_encoder_architecture}')")
+ args.text_encoder_architecture = "umt5-xxl"
+
+ # Load text encoder and tokenizer for training (since we need to encode text at runtime)
+ if args.text_encoder_architecture in ["umt5-base", "umt5-xxl", "t5"]:
+ if args.resume_from_checkpoint:
+ text_encoder = T5EncoderModel.from_pretrained(
+ args.resume_from_checkpoint, subfolder="text_encoder", variant=args.variant
+ )
+ tokenizer = T5Tokenizer.from_pretrained(
+ args.resume_from_checkpoint, subfolder="tokenizer", variant=args.variant
+ )
+ else:
+ # Map architecture to model ID
+ if args.text_encoder_architecture == "umt5-base":
+ model_id = "google/umt5-base"
+ elif args.text_encoder_architecture == "umt5-xxl":
+ model_id = "google/umt5-xxl"
+ elif args.text_encoder_architecture == "t5":
+ model_id = "t5-base"
+
+ text_encoder = T5EncoderModel.from_pretrained(model_id, variant=args.variant)
+ tokenizer = T5Tokenizer.from_pretrained(model_id, variant=args.variant)
+ text_encoder.eval()
+ text_encoder.requires_grad_(False)
+ text_encoder.config.use_cache = False
+
+ text_encoder.to(accelerator.device)
+ logger.info(f"Loaded text encoder: {model_id} (d_model={text_encoder.config.d_model})")
+
+
+ # Get mask_token_id and codebook_size from metadata (required)
+ metadata_file = os.path.join(args.features_dir, "metadata.json")
+ if not os.path.exists(metadata_file):
+ raise ValueError(f"Metadata file not found: {metadata_file}. Please ensure features were extracted with extract_features.py.")
+
+ import json
+ with open(metadata_file, 'r') as f:
+ metadata = json.load(f)
+
+ codebook_size = metadata.get("codebook_size")
+ mask_token_id = metadata.get("mask_token_id")
+
+ if codebook_size is None or mask_token_id is None:
+ raise ValueError(
+ f"codebook_size and mask_token_id must be in metadata.json. "
+ f"Found: codebook_size={codebook_size}, mask_token_id={mask_token_id}. "
+ f"Please re-run extract_features.py to ensure metadata is complete."
+ )
+
+ logger.info(f"Loaded from metadata: codebook_size={codebook_size}, mask_token_id={mask_token_id}")
+
+ # Create a minimal object with just the attributes we need
+ class MinimalTokenizer:
+ def __init__(self, mask_token_id, codebook_size):
+ self.mask_token_id = mask_token_id
+ self.codebook_size = codebook_size
+ video_tokenizer = MinimalTokenizer(mask_token_id, codebook_size)
+ logger.info(f"Minimal tokenizer created: mask_token_id={mask_token_id}, codebook_size={codebook_size}")
+ else:
+ # Load text encoder and tokenizer normally
+ if args.text_encoder_architecture in ["umt5-base", "umt5-xxl", "t5"]:
+ if args.resume_from_checkpoint:
+ text_encoder = T5EncoderModel.from_pretrained(
+ args.resume_from_checkpoint, subfolder="text_encoder", variant=args.variant
+ )
+ tokenizer = T5Tokenizer.from_pretrained(
+ args.resume_from_checkpoint, subfolder="tokenizer", variant=args.variant
+ )
+ else:
+ # Map architecture to model ID
+ if args.text_encoder_architecture == "umt5-base":
+ model_id = "google/umt5-base"
+ elif args.text_encoder_architecture == "umt5-xxl":
+ model_id = "google/umt5-xxl"
+ elif args.text_encoder_architecture == "t5":
+ model_id = "t5-base" # or "google/t5-v1_1-base" depending on your needs
+ else:
+ raise ValueError(f"Unknown text encoder architecture: {args.text_encoder_architecture}")
+
+ # `model_id` may be a Hub ID or a local directory; add local_files_only=True
+ # to forbid network access when loading from disk.
+ text_encoder = T5EncoderModel.from_pretrained(model_id)
+ tokenizer = T5Tokenizer.from_pretrained(model_id)
+ logger.info(f"Loaded text encoder: {model_id} (d_model={text_encoder.config.d_model})")
+
+ # Initialize video tokenizer for video training
+ device = accelerator.device
+ dtype = torch.float32
+ if accelerator.mixed_precision == "fp16":
+ dtype = torch.float16
+ elif accelerator.mixed_precision == "bf16":
+ dtype = torch.bfloat16
+ text_encoder.to(accelerator.device)
+ # The text tokenizer is a CPU-side object with no .to(); only its outputs are moved to the device.
+
+ video_tokenizer = CosmosVideoTokenizer(
+ model_id=args.video_tokenizer_model_id,
+ device=device,
+ dtype=dtype
+ )
+ video_tokenizer.requires_grad_(False)
+
+ if not args.use_precomputed_features and not args.use_precomputed_video_only and args.train_text_encoder:
+ if args.text_encoder_use_lora:
+ lora_config = LoraConfig(
+ r=args.text_encoder_lora_r,
+ lora_alpha=args.text_encoder_lora_alpha,
+ target_modules=args.text_encoder_lora_target_modules,
+ )
+ text_encoder.add_adapter(lora_config)
+ text_encoder.train()
+ text_encoder.requires_grad_(True)
+ else:
+ text_encoder.eval()
+ text_encoder.requires_grad_(False)
+
+ # Initialize video transformer model
+ if args.training_from_scratch:
+ # Calculate compressed dimensions based on Cosmos tokenizer
+ # Cosmos compresses: F' = F // 8, H' = H // 16, W' = W // 16
+ # However, actual encoding may have slight variations due to padding/rounding
+ # So we test with a dummy video to get the exact dimensions
+ if args.use_precomputed_features or args.use_precomputed_video_only:
+ # For precomputed features, get dimensions from metadata or sample file
+ logger.info("Getting compressed dimensions from precomputed features...")
+ metadata_file = os.path.join(args.features_dir, "metadata.json")
+
+ # Try to get dimensions from metadata first
+ F_prime, H_prime, W_prime = None, None, None
+ if os.path.exists(metadata_file):
+ import json
+ with open(metadata_file, 'r') as f:
+ metadata = json.load(f)
+ # Check if metadata has sample shape info
+ if 'samples' in metadata and len(metadata['samples']) > 0:
+ sample = metadata['samples'][0]
+ if 'video_code_shape' in sample:
+ shape = sample['video_code_shape']
+ if len(shape) == 3: # [F', H', W']
+ F_prime, H_prime, W_prime = shape[0], shape[1], shape[2]
+ logger.info(f"Got dimensions from metadata: F'={F_prime}, H'={H_prime}, W'={W_prime}")
+
+ # If not in metadata, load a sample file
+ # if F_prime is None or H_prime is None or W_prime is None:
+ # logger.info("Loading a sample file to get dimensions...")
+ # # Try to find the first available sample
+ # video_codes_dir = os.path.join(args.features_dir, "video_codes")
+ # sample_path = None
+ # for level1 in range(1000):
+ # level1_dir = os.path.join(video_codes_dir, f"{level1:03d}")
+ # if not os.path.exists(level1_dir):
+ # continue
+ # for level2 in range(1000):
+ # level2_dir = os.path.join(level1_dir, f"{level2:03d}")
+ # if not os.path.exists(level2_dir):
+ # continue
+ # for level3 in range(1000):
+ # level3_dir = os.path.join(level2_dir, f"{level3:03d}")
+ # if not os.path.exists(level3_dir):
+ # continue
+ # # Find first .npy file
+ # for filename in os.listdir(level3_dir):
+ # if filename.endswith('.npy'):
+ # sample_path = os.path.join(level3_dir, filename)
+ # break
+ # if sample_path:
+ # break
+ # if sample_path:
+ # break
+ # if sample_path:
+ # break
+
+ # if sample_path:
+ # import numpy as np
+ # sample_tokens = np.load(sample_path) # [F', H', W']
+ # F_prime, H_prime, W_prime = sample_tokens.shape[0], sample_tokens.shape[1], sample_tokens.shape[2]
+ # logger.info(f"Got dimensions from sample file: F'={F_prime}, H'={H_prime}, W'={W_prime}")
+ # else:
+ # raise FileNotFoundError(f"Could not find any sample files in {video_codes_dir} to determine dimensions")
+ else:
+ # For non-precomputed features, use tokenizer to encode dummy video
+ dummy_video = torch.zeros(1, 3, args.num_frames, args.video_height, args.video_width,
+ device=accelerator.device, dtype=torch.float32)
+ with torch.no_grad():
+ dummy_tokens = video_tokenizer.encode(dummy_video) # [1, F', H', W']
+ F_prime, H_prime, W_prime = dummy_tokens.shape[1], dummy_tokens.shape[2], dummy_tokens.shape[3]
+ logger.info(f"Actual compressed dimensions from tokenizer: F'={F_prime}, H'={H_prime}, W'={W_prime}")
+ logger.info(f"Theoretical dimensions: F'={args.num_frames // video_tokenizer.t_downsample}, "
+ f"H'={args.video_height // video_tokenizer.h_downsample}, "
+ f"W'={args.video_width // video_tokenizer.w_downsample}")
+
+ # Get text encoder dimension
+ if args.use_precomputed_features:
+ # For precomputed features, get text_dim from metadata or use default
+ text_dim_actual = None
+ metadata_file = os.path.join(args.features_dir, "metadata.json")
+ if os.path.exists(metadata_file):
+ import json
+ with open(metadata_file, 'r') as f:
+ metadata = json.load(f)
+ # Try to get from a sample
+ if 'samples' in metadata and len(metadata['samples']) > 0:
+ sample = metadata['samples'][0]
+ if 'text_embedding_shape' in sample:
+ shape = sample['text_embedding_shape']
+ if len(shape) == 2: # [L, D]
+ text_dim_actual = shape[1]
+ logger.info(f"Got text_dim from metadata: {text_dim_actual}")
+
+ # If not found, use default based on architecture
+ if text_dim_actual is None:
+ if args.text_encoder_architecture == "umt5-base":
+ text_dim_actual = 768
+ elif args.text_encoder_architecture == "umt5-xxl":
+ text_dim_actual = 4096
+ elif args.text_encoder_architecture == "t5":
+ text_dim_actual = 768
+ else:
+ text_dim_actual = 768 # default
+ logger.info(f"Using default text_dim for {args.text_encoder_architecture}: {text_dim_actual}")
+ elif args.use_precomputed_video_only:
+ # For video-only mode, use the actual text encoder dimension (UMT5-XXL)
+ text_dim_actual = text_encoder.config.d_model
+ logger.info(f"Using actual text encoder dimension for {args.text_encoder_architecture}: {text_dim_actual}")
+ else:
+ text_dim_actual = text_encoder.config.d_model
+
+ # If Wan pretrained path is provided, load config from it first
+ wan_config = None
+ if args.wan_pretrained_path is not None:
+ logger.info(f"Loading Wan config from: {args.wan_pretrained_path}")
+ try:
+ # Try to load WanModel config
+ # Try loading config.json directly
+ import json
+ config_path = os.path.join(args.wan_pretrained_path, "config.json")
+ if os.path.exists(config_path):
+ with open(config_path, 'r') as f:
+ wan_config_dict = json.load(f)
+ # Create a simple config object
+ from types import SimpleNamespace
+ wan_config = SimpleNamespace(**wan_config_dict)
+ else:
+ logger.warning(f"Could not find config in {args.wan_pretrained_path}, using default values")
+
+ if wan_config is not None:
+ logger.info(f"Loaded Wan config: dim={getattr(wan_config, 'dim', 'N/A')}, "
+ f"ffn_dim={getattr(wan_config, 'ffn_dim', 'N/A')}, "
+ f"num_layers={getattr(wan_config, 'num_layers', 'N/A')}, "
+ f"num_heads={getattr(wan_config, 'num_heads', 'N/A')}")
+ except Exception as e:
+ logger.warning(f"Failed to load Wan config: {e}, using default values")
+
+
+ # Use Wan config if available, otherwise use defaults
+ dim = getattr(wan_config, 'dim', 2048) if wan_config else 2048
+ ffn_dim = getattr(wan_config, 'ffn_dim', 8192) if wan_config else 8192
+ num_layers = getattr(wan_config, 'num_layers', 32) if wan_config else 32
+ num_heads = getattr(wan_config, 'num_heads', 16) if wan_config else 16
+ freq_dim = getattr(wan_config, 'freq_dim', 256) if wan_config else 256
+ in_dim = getattr(wan_config, 'in_dim', 16) if wan_config else 16
+ out_dim = getattr(wan_config, 'out_dim', 16) if wan_config else 16
+
+ # text_dim: Use Wan's text_dim if available, but warn if it doesn't match text encoder
+ wan_text_dim = getattr(wan_config, 'text_dim', None) if wan_config else None
+ if wan_text_dim is not None and wan_text_dim != text_dim_actual:
+ logger.warning(f"Wan config text_dim ({wan_text_dim}) doesn't match text encoder dimension ({text_dim_actual}). "
+ f"Will use text encoder dimension and skip loading text_embedding weights.")
+ text_dim_for_model = text_dim_actual
+ else:
+ # Use Wan's text_dim if it matches, or use text encoder dimension
+ text_dim_for_model = wan_text_dim if wan_text_dim is not None else text_dim_actual
+
+ model = WanDiscreteVideoTransformer(
+ codebook_size=video_tokenizer.codebook_size,
+ vocab_size=video_tokenizer.codebook_size + 1,
+ num_frames=F_prime,
+ height=H_prime,
+ width=W_prime,
+ model_type='t2v',
+ patch_size=(1, 2, 2),
+ text_len=512,
+ in_dim=in_dim,
+ dim=dim,
+ ffn_dim=ffn_dim,
+ freq_dim=freq_dim,
+ text_dim=text_dim_for_model,
+ out_dim=out_dim,
+ num_heads=num_heads,
+ num_layers=num_layers,
+ window_size=(-1, -1),
+ qk_norm=True,
+ cross_attn_norm=True,
+ eps=1e-6
+ )
+
+ # Load Wan pretrained weights into backbone if provided
+ if args.wan_pretrained_path is not None:
+ logger.info(f"Loading Wan pretrained weights from: {args.wan_pretrained_path}")
+ try:
+ # Check if it's a local path or HuggingFace model ID
+ is_local_path = os.path.exists(args.wan_pretrained_path) and os.path.isdir(args.wan_pretrained_path)
+
+ if is_local_path:
+ # Local path: find the state dict file
+ state_dict_path = None
+ possible_paths = [
+ os.path.join(args.wan_pretrained_path, "diffusion_pytorch_model.safetensors"),
+ os.path.join(args.wan_pretrained_path, "diffusion_pytorch_model.bin"),
+ os.path.join(args.wan_pretrained_path, "pytorch_model.bin"),
+ os.path.join(args.wan_pretrained_path, "model.safetensors"),
+ ]
+ for p in possible_paths:
+ if os.path.exists(p):
+ state_dict_path = p
+ break
+
+ if state_dict_path is None:
+ raise FileNotFoundError(f"Could not find state dict in {args.wan_pretrained_path}")
+
+ logger.info(f"Loading weights from local path: {state_dict_path}")
+
+ # Load state dict from local file
+ if state_dict_path.endswith('.safetensors'):
+ from safetensors import safe_open
+ wan_state_dict = {}
+ with safe_open(state_dict_path, framework="pt", device="cpu") as f:
+ for k in f.keys():
+ wan_state_dict[k] = f.get_tensor(k)
+ else:
+ wan_state_dict = torch.load(state_dict_path, map_location="cpu", weights_only=True)
+ else:
+ # HuggingFace model ID: try to load using from_pretrained
+ logger.info(f"Loading weights from HuggingFace Hub: {args.wan_pretrained_path}")
+ try:
+ # Try loading as WanModel first
+ temp_model = WanModel.from_pretrained(
+ args.wan_pretrained_path,
+ subfolder=None,
+ low_cpu_mem_usage=False,
+ device_map=None
+ )
+ wan_state_dict = temp_model.state_dict()
+ del temp_model
+ except Exception:
+ # If that fails, try with 'backbone' subfolder
+ try:
+ temp_model = WanModel.from_pretrained(
+ args.wan_pretrained_path,
+ subfolder="backbone",
+ low_cpu_mem_usage=False,
+ device_map=None
+ )
+ wan_state_dict = temp_model.state_dict()
+ del temp_model
+ except Exception:
+ # Last resort: try to download and load state dict directly
+ from huggingface_hub import hf_hub_download
+ import tempfile
+ with tempfile.TemporaryDirectory() as tmpdir:
+ # Try different possible filenames
+ possible_files = [
+ "diffusion_pytorch_model.safetensors",
+ "diffusion_pytorch_model.bin",
+ "pytorch_model.bin",
+ "model.safetensors",
+ ]
+ state_dict_path = None
+ for filename in possible_files:
+ try:
+ state_dict_path = hf_hub_download(
+ repo_id=args.wan_pretrained_path,
+ filename=filename,
+ cache_dir=tmpdir
+ )
+ break
+ except Exception:
+ continue
+
+ if state_dict_path is None:
+ raise FileNotFoundError(
+ f"Could not find state dict file in HuggingFace model {args.wan_pretrained_path}"
+ )
+
+ # Load state dict
+ if state_dict_path.endswith('.safetensors'):
+ from safetensors.torch import load_file
+ wan_state_dict = load_file(state_dict_path, device="cpu")
+ else:
+ # weights_only=True restricts unpickling to tensors and plain containers
+ wan_state_dict = torch.load(state_dict_path, map_location="cpu", weights_only=True)
+
+ # Remove text_embedding weights if input dimension doesn't match
+ # This is necessary when using a different text encoder (e.g., UMT5-base with 768 dim
+ # vs Wan's original 4096 dim)
+ # Check the first text_embedding layer's input dimension (text_embedding.0.weight shape[1])
+ text_embedding_key = 'text_embedding.0.weight'
+ if text_embedding_key in wan_state_dict:
+ pretrained_text_dim = wan_state_dict[text_embedding_key].shape[1] # Input dimension
+ model_text_dim = model.backbone.text_embedding[0].weight.shape[1] # Model's expected input dimension
+
+ if pretrained_text_dim != model_text_dim:
+ # Remove all text_embedding related keys
+ keys_to_remove = [k for k in wan_state_dict.keys() if 'text_embedding' in k]
+ for k in keys_to_remove:
+ del wan_state_dict[k]
+ logger.info(f"Removed {len(keys_to_remove)} text_embedding keys due to input dimension mismatch "
+ f"(pretrained: {pretrained_text_dim}, model: {model_text_dim})")
+
+ # Load into model's backbone
+ missing_keys, unexpected_keys = model.backbone.load_state_dict(wan_state_dict, strict=False)
+
+ # Log results
+ if missing_keys:
+ # Filter out expected missing keys (text_embedding if removed)
+ actual_missing = [k for k in missing_keys if 'text_embedding' not in k]
+ if actual_missing:
+ logger.warning(f"Missing keys when loading Wan weights: {actual_missing[:10]}..."
+ if len(actual_missing) > 10 else f"Missing keys: {actual_missing}")
+ else:
+ logger.info("Only text_embedding keys are missing (expected due to text_dim mismatch)")
+ if unexpected_keys:
+ logger.warning(f"Unexpected keys when loading Wan weights: {unexpected_keys[:10]}..."
+ if len(unexpected_keys) > 10 else f"Unexpected keys: {unexpected_keys}")
+
+ logger.info("✓ Loaded Wan pretrained weights into backbone")
+
+ except Exception as e:
+ logger.warning(f"Failed to load Wan pretrained weights: {e}")
+ import traceback
+ traceback.print_exc()
+ logger.warning("Continuing with random initialization")
+ else:
+ # Load from pretrained checkpoint
+ model = WanDiscreteVideoTransformer.from_pretrained(
+ args.pretrained_model_name_or_path, subfolder="transformer", low_cpu_mem_usage=False, device_map=None
+ )
+
+ # Save vocab_size before torch.compile (for use in training loop)
+ # This avoids issues with accelerate.unwrap_model when using torch.compile
+ vocab_size = model.vocab_size
+
+ # Convert model to correct dtype before torch.compile
+ # This ensures all layers (especially text_embedding which is randomly initialized) are on the right dtype
+ if accelerator.mixed_precision == "fp16":
+ model = model.to(dtype=torch.float16)
+ elif accelerator.mixed_precision == "bf16":
+ model = model.to(dtype=torch.bfloat16)
+ # else: keep float32
+
+ model = torch.compile(model)
+
+ if args.use_lora:
+ lora_config = LoraConfig(
+ r=args.lora_r,
+ lora_alpha=args.lora_alpha,
+ target_modules=args.lora_target_modules,
+ )
+ model.add_adapter(lora_config)
+
+ model.train()
+
+ # Freeze Wan backbone if requested
+ if args.freeze_wan_backbone:
+ for name, param in model.named_parameters():
+ if 'backbone' in name:
+ param.requires_grad = False
+ logger.info("Wan backbone parameters are frozen (requires_grad=False)")
+
+ if args.gradient_checkpointing:
+ model.enable_gradient_checkpointing()
+ if args.train_text_encoder and not args.use_precomputed_features and not args.use_precomputed_video_only:
+ # Only enable gradient checkpointing for text_encoder if it's loaded
+ text_encoder.gradient_checkpointing_enable()
+ elif args.use_precomputed_video_only:
+ # For video-only mode, enable gradient checkpointing for text encoder to save memory
+ text_encoder.gradient_checkpointing_enable()
+ logger.info("Enabled gradient checkpointing for text encoder to save memory")
+
+ # EMA is not used for video training
+ ema = None
+
+ def save_model_hook(models, weights, output_dir):
+ if accelerator.is_main_process:
+ transformer_lora_layers_to_save = None
+ text_encoder_lora_layers_to_save = None
+
+ for model_ in models:
+ # Unwrap model_ to get the actual model type (handles torch.compile wrapping)
+ unwrapped_model_ = safe_unwrap_model(model_, accelerator)
+
+ # Use class name comparison for more robust type checking
+ # This handles cases where the same class might be loaded from different modules
+ model_class_name = unwrapped_model_.__class__.__name__
+
+ if model_class_name == "WanDiscreteVideoTransformer":
+ if args.use_lora:
+ transformer_lora_layers_to_save = get_peft_model_state_dict(model_)
+ else:
+ # Unwrap before saving to avoid torch.compile issues
+ unwrapped_model_.save_pretrained(os.path.join(output_dir, "transformer"))
+ elif model_class_name in ["T5EncoderModel", "T5Model"]:
+ if args.text_encoder_use_lora:
+ text_encoder_lora_layers_to_save = get_peft_model_state_dict(model_)
+ else:
+ # Unwrap before saving to avoid torch.compile issues
+ unwrapped_model_.save_pretrained(os.path.join(output_dir, "text_encoder"))
+ else:
+ raise ValueError(f"unexpected save model: {model_.__class__}, unwrapped: {unwrapped_model_.__class__.__name__}")
+
+ # make sure to pop weight so that corresponding model is not saved again
+ weights.pop()
+
+ if transformer_lora_layers_to_save is not None or text_encoder_lora_layers_to_save is not None:
+ LoraLoaderMixin.save_lora_weights(
+ output_dir,
+ unet_lora_layers=transformer_lora_layers_to_save,
+ text_encoder_lora_layers=text_encoder_lora_layers_to_save,
+ )
+
+ # EMA not used for video training
+
+ def load_model_hook(models, input_dir):
+ transformer = None
+ text_encoder_ = None
+
+ # torch.compile() wraps the model, so its state-dict keys carry an '_orig_mod.' prefix.
+ # Add that prefix to the keys of a plain checkpoint so they match the compiled model.
+ def adap_compile(ori_dict):
+ return {'_orig_mod.' + k: v for k, v in ori_dict.items()}
+
+ while len(models) > 0:
+ model_ = models.pop()
+
+ # Unwrap model to get the actual class name
+ unwrapped_model_ = safe_unwrap_model(model_, accelerator)
+ model_class_name = unwrapped_model_.__class__.__name__
+
+ if model_class_name == "WanDiscreteVideoTransformer":
+ if args.use_lora:
+ transformer = model_
+ else:
+ load_model = WanDiscreteVideoTransformer.from_pretrained(os.path.join(input_dir, "transformer"), low_cpu_mem_usage=False, device_map=None)
+ model_.load_state_dict(adap_compile(load_model.state_dict()))
+ del load_model
+ elif model_class_name in ["T5EncoderModel", "T5Model"]:
+ if args.text_encoder_use_lora:
+ text_encoder_ = model_
+ else:
+ try:
+ load_model = T5EncoderModel.from_pretrained(os.path.join(input_dir, "text_encoder"))
+ model_.load_state_dict(load_model.state_dict())
+ except Exception:
+ logger.warning("No text encoder found in the checkpoint folder. Loading default UMT5-base.")
+ load_model = T5EncoderModel.from_pretrained("google/umt5-base")
+ model_.load_state_dict(load_model.state_dict())
+ del load_model
+ else:
+ raise ValueError(f"unexpected load model: {model_.__class__}, unwrapped: {model_class_name}")
+
+ if transformer is not None or text_encoder_ is not None:
+ lora_state_dict, network_alphas = LoraLoaderMixin.lora_state_dict(input_dir)
+ LoraLoaderMixin.load_lora_into_text_encoder(
+ lora_state_dict, network_alphas=network_alphas, text_encoder=text_encoder_
+ )
+ LoraLoaderMixin.load_lora_into_transformer(
+ lora_state_dict, network_alphas=network_alphas, transformer=transformer
+ )
+
+ # EMA not used for video training
+
+ accelerator.register_load_state_pre_hook(load_model_hook)
+ accelerator.register_save_state_pre_hook(save_model_hook)
+
+ if args.scale_lr:
+ args.learning_rate = (
+ args.learning_rate * args.train_batch_size * accelerator.num_processes * args.gradient_accumulation_steps
+ )
+
+ if args.use_8bit_adam:
+ try:
+ import bitsandbytes as bnb
+ except ImportError:
+ raise ImportError(
+ "Please install bitsandbytes to use 8-bit Adam. You can do so by running `pip install bitsandbytes`"
+ )
+
+ optimizer_cls = bnb.optim.AdamW8bit
+ else:
+ optimizer_cls = torch.optim.AdamW
+
+ # Separate Wan backbone parameters from other parameters (token_embedding, logits_head)
+ # This allows different learning rates for backbone vs head/tail
+ backbone_params = []
+ other_params = []
+
+ for name, param in model.named_parameters():
+ if 'backbone' in name:
+ backbone_params.append((name, param))
+ else:
+ other_params.append((name, param))
+
+ # Log parameter counts
+ backbone_param_count = sum(p.numel() for _, p in backbone_params)
+ other_param_count = sum(p.numel() for _, p in other_params)
+ total_param_count = sum(p.numel() for _, p in model.named_parameters())
+ logger.info(f"Parameter counts: backbone={backbone_param_count:,}, other={other_param_count:,}, total={total_param_count:,}")
+
+ # no decay on bias and layernorm and embedding
+ no_decay = ["bias", "layer_norm.weight", "mlm_ln.weight", "embeddings.weight"]
+
+ # Calculate backbone lr
+ if args.freeze_wan_backbone:
+ backbone_lr = 0.0
+ logger.info("Wan backbone is frozen (lr=0)")
+ else:
+ backbone_lr = args.learning_rate * args.wan_backbone_lr_ratio
+ logger.info(f"Wan backbone lr = {backbone_lr:.6f} (base_lr * {args.wan_backbone_lr_ratio})")
+
+ logger.info(f"Other parts (token_embedding, logits_head) lr = {args.learning_rate:.6f}")
+
+ # Group parameters: backbone and other parts, each with decay and no_decay
+ optimizer_grouped_parameters = []
+
+ # Backbone parameters with weight decay
+ if backbone_params:
+ backbone_with_decay = [p for n, p in backbone_params if not any(nd in n for nd in no_decay)]
+ if backbone_with_decay:
+ optimizer_grouped_parameters.append({
+ "params": backbone_with_decay,
+ "lr": backbone_lr,
+ "weight_decay": args.adam_weight_decay,
+ })
+
+ # Backbone parameters without weight decay
+ backbone_no_decay = [p for n, p in backbone_params if any(nd in n for nd in no_decay)]
+ if backbone_no_decay:
+ optimizer_grouped_parameters.append({
+ "params": backbone_no_decay,
+ "lr": backbone_lr,
+ "weight_decay": 0.0,
+ })
+
+ # Other parameters (token_embedding, logits_head) with weight decay
+ if other_params:
+ other_with_decay = [p for n, p in other_params if not any(nd in n for nd in no_decay)]
+ if other_with_decay:
+ optimizer_grouped_parameters.append({
+ "params": other_with_decay,
+ "lr": args.learning_rate,
+ "weight_decay": args.adam_weight_decay,
+ })
+
+ # Other parameters without weight decay
+ other_no_decay = [p for n, p in other_params if any(nd in n for nd in no_decay)]
+ if other_no_decay:
+ optimizer_grouped_parameters.append({
+ "params": other_no_decay,
+ "lr": args.learning_rate,
+ "weight_decay": 0.0,
+ })
+
+ if args.train_text_encoder and not args.use_precomputed_features and not args.use_precomputed_video_only:
+ # Only add text_encoder to optimizer if it's loaded and not using precomputed features
+ optimizer_grouped_parameters.append(
+ {"params": text_encoder.parameters(), "weight_decay": args.adam_weight_decay}
+ )
+
+ optimizer = optimizer_cls(
+ optimizer_grouped_parameters,
+ lr=args.learning_rate,
+ betas=(args.adam_beta1, args.adam_beta2),
+ weight_decay=args.adam_weight_decay,
+ eps=args.adam_epsilon,
+ )
+
+ logger.info("Creating dataloaders and lr_scheduler")
+
+ total_batch_size = args.train_batch_size * accelerator.num_processes * args.gradient_accumulation_steps
+
+ # Video training datasets
+ if args.use_precomputed_features:
+ # Use pre-extracted features
+ logger.info(f"Using pre-extracted features from: {args.features_dir}")
+ dataset = PrecomputedFeatureDataset(
+ features_dir=args.features_dir,
+ )
+ elif args.use_precomputed_video_only:
+ # Use pre-extracted video codes only, process text at runtime
+ logger.info(f"Using pre-extracted video codes from: {args.features_dir}")
+ logger.info(f"Text will be encoded at runtime with the {args.text_encoder_architecture} text encoder")
+ # PrecomputedVideoOnlyDataset loads the precomputed video codes; prompts are tokenized
+ # with the tokenizer attached below and encoded during training
+ dataset = PrecomputedVideoOnlyDataset(
+ features_dir=args.features_dir,
+ )
+ # Set up tokenizer for dataset
+ dataset.tokenizer = tokenizer
+ dataset.text_encoder_architecture = args.text_encoder_architecture
+ elif args.instance_dataset == 'OpenVid1MDataset':
+ # OpenVid1M dataset from CSV file
+ csv_path = args.instance_data_dir
+ if not os.path.exists(csv_path):
+ raise FileNotFoundError(f"CSV file not found: {csv_path}")
+
+ # Video root directory: assume videos are in the same directory as CSV or in a 'video_reorg' subdirectory
+ csv_dir = os.path.dirname(csv_path)
+ # Try to find video directory
+ if os.path.exists(os.path.join(csv_dir, 'video_reorg')):
+ video_root_dir = os.path.join(csv_dir, 'video_reorg')
+ elif os.path.exists(os.path.join(os.path.dirname(csv_dir), 'video_reorg')):
+ video_root_dir = os.path.join(os.path.dirname(csv_dir), 'video_reorg')
+ else:
+ # Fallback: use CSV directory
+ video_root_dir = csv_dir
+ logger.warning(f"Video directory not found, using CSV directory: {video_root_dir}")
+
+ dataset = OpenVid1MDataset(
+ csv_path=csv_path,
+ video_root_dir=video_root_dir,
+ tokenizer=tokenizer,
+ num_frames=args.num_frames,
+ height=args.video_height,
+ width=args.video_width,
+ text_encoder_architecture=args.text_encoder_architecture,
+ prompt_prefix=args.prompt_prefix,
+ )
+ elif args.instance_dataset in ('HuggingFaceDataset', 'VideoDataset'):
+ dataset = VideoDataset(
+ hf_dataset=load_dataset(args.instance_data_dir, split="train"),
+ tokenizer=tokenizer,
+ video_key=args.image_key if args.image_key else "video",
+ prompt_key=args.prompt_key if args.prompt_key else "caption",
+ prompt_prefix=args.prompt_prefix,
+ num_frames=args.num_frames,
+ height=args.video_height,
+ width=args.video_width,
+ text_encoder_architecture=args.text_encoder_architecture
+ )
+ else:
+ raise ValueError(f"For video training, instance_dataset must be 'OpenVid1MDataset', 'HuggingFaceDataset' or 'VideoDataset', got '{args.instance_dataset}'")
+
+ # Adjust DataLoader settings for precomputed features to reduce memory usage
+ if args.use_precomputed_features or args.use_precomputed_video_only:
+ # For precomputed features, reduce prefetch to save memory
+ # Features are already extracted, so we don't need as much prefetching
+ prefetch_factor = min(args.dataloader_prefetch_factor, 1) if args.dataloader_num_workers > 0 else None
+ # Consider disabling pin_memory for precomputed features if memory is tight
+ # pin_memory=True is faster but uses more memory
+ pin_memory = True # Keep True for performance, but can be set to False if OOM persists
+ logger.info(f"Using precomputed features - DataLoader settings: prefetch_factor={prefetch_factor}, pin_memory={pin_memory}")
+ else:
+ prefetch_factor = args.dataloader_prefetch_factor if args.dataloader_num_workers > 0 else None
+ pin_memory = True
+
+ train_dataloader = DataLoader(
+ dataset,
+ batch_size=args.train_batch_size,
+ shuffle=True,
+ num_workers=args.dataloader_num_workers,
+ collate_fn=default_collate,
+ pin_memory=pin_memory,
+ prefetch_factor=prefetch_factor,
+ persistent_workers=args.dataloader_num_workers > 0, # Keep workers alive between epochs
+ )
+ train_dataloader.num_batches = len(train_dataloader)
+
+ # Log dataloader configuration for performance monitoring
+ if accelerator.is_main_process:
+ logger.info("Dataloader configuration:")
+ logger.info(f" - num_workers: {args.dataloader_num_workers} (0 = single-threaded, recommended: 4-8 for video)")
+ logger.info(f" - prefetch_factor: {prefetch_factor if prefetch_factor is not None else 'N/A (num_workers=0)'}")
+ logger.info(f" - persistent_workers: {args.dataloader_num_workers > 0}")
+ logger.info(f" - pin_memory: {pin_memory}")
+ if args.dataloader_num_workers == 0:
+ logger.warning(
+ "⚠️ num_workers=0 may cause GPU underutilization. "
+ "Consider setting --dataloader_num_workers 4-8 to improve GPU utilization."
+ )
+
+ # Calculate max_train_steps if not provided
+ if args.max_train_steps is None:
+ # Default to 1 epoch if not specified
+ num_update_steps_per_epoch = math.ceil(train_dataloader.num_batches / args.gradient_accumulation_steps)
+ args.max_train_steps = num_update_steps_per_epoch
+ logger.warning(f"max_train_steps not specified, defaulting to 1 epoch ({args.max_train_steps} steps)")
+
+ lr_scheduler = diffusers.optimization.get_scheduler(
+ args.lr_scheduler,
+ optimizer=optimizer,
+ num_training_steps=args.max_train_steps * accelerator.num_processes,
+ num_warmup_steps=args.lr_warmup_steps * accelerator.num_processes,
+ )
+
+ logger.info("Preparing model, optimizer and dataloaders")
+
+ if args.use_precomputed_features:
+ # Don't prepare text_encoder if using precomputed features
+ model, optimizer, lr_scheduler, train_dataloader = accelerator.prepare(
+ model, optimizer, lr_scheduler, train_dataloader
+ )
+ elif args.use_precomputed_video_only:
+ # The text encoder is only used for inference in this mode, so it is not passed to accelerator.prepare
+ model, optimizer, lr_scheduler, train_dataloader = accelerator.prepare(
+ model, optimizer, lr_scheduler, train_dataloader
+ )
+ elif args.train_text_encoder:
+ model, optimizer, lr_scheduler, train_dataloader, text_encoder = accelerator.prepare(
+ model, optimizer, lr_scheduler, train_dataloader, text_encoder
+ )
+ else:
+ model, optimizer, lr_scheduler, train_dataloader = accelerator.prepare(
+ model, optimizer, lr_scheduler, train_dataloader
+ )
+
+ train_dataloader.num_batches = len(train_dataloader)
+
+ weight_dtype = torch.float32
+ if accelerator.mixed_precision == "fp16":
+ weight_dtype = torch.float16
+ elif accelerator.mixed_precision == "bf16":
+ weight_dtype = torch.bfloat16
+
+ if not args.use_precomputed_features and not args.use_precomputed_video_only:
+ if not args.train_text_encoder:
+ text_encoder.to(device=accelerator.device, dtype=weight_dtype)
+ # Video tokenizer is already on the correct device
+ elif args.use_precomputed_video_only:
+ # The text encoder is not wrapped by accelerator in this mode; it is assumed to already be on the target device
+ logger.info("Video-only precomputed mode: using the text encoder as loaded (not wrapped by accelerator)")
+ else:
+ # For precomputed features, text_encoder is None, so skip
+ logger.info("Skipping text_encoder.to() - using precomputed features")
+
+ # EMA not used for video training
+
+ if not args.use_precomputed_features and not args.use_precomputed_video_only:
+ with nullcontext() if args.train_text_encoder else torch.no_grad():
+ # T5/UMT5 doesn't have cond_embeds, only encoder_hidden_states
+ input_ids_empty, attention_mask_empty = tokenize_prompt(tokenizer, "", args.text_encoder_architecture)
+ empty_embeds, _, _ = encode_prompt(
+ text_encoder, input_ids_empty.to(accelerator.device, non_blocking=True), args.text_encoder_architecture, attention_mask_empty.to(accelerator.device, non_blocking=True)
+ )
+ empty_clip_embeds = None # Not used for T5
+
+ # Video training doesn't use instance_data_image
+ elif args.use_precomputed_video_only:
+ # Generate empty_embeds at runtime using loaded text encoder
+ with torch.no_grad():
+ input_ids_empty, attention_mask_empty = tokenize_prompt(tokenizer, "", args.text_encoder_architecture)
+ empty_embeds, _, _ = encode_prompt(
+ text_encoder, input_ids_empty.to(accelerator.device, non_blocking=True), args.text_encoder_architecture, attention_mask_empty.to(accelerator.device, non_blocking=True)
+ )
+ logger.info(f"Generated empty_embeds at runtime: shape={empty_embeds.shape}, dtype={empty_embeds.dtype}")
+ else:
+ # For precomputed features, load empty_embeds from file if needed
+ empty_clip_embeds = None # Not used for T5
+ if args.cond_dropout_prob > 0.0:
+ if args.empty_embeds_path is None:
+ raise ValueError(
+ "--empty_embeds_path is required when --use_precomputed_features is set "
+ "and --cond_dropout_prob > 0.0"
+ )
+ logger.info(f"Loading empty_embeds from: {args.empty_embeds_path}")
+
+ # Load empty_embeds from .npy file (more space-efficient than .pt)
+ import numpy as np
+ empty_embeds_np = np.load(args.empty_embeds_path) # Load as numpy array
+ empty_embeds = torch.from_numpy(empty_embeds_np).to(dtype=weight_dtype) # Convert to tensor
+
+ # Load metadata to verify compatibility
+ metadata_file = os.path.join(args.features_dir, "metadata.json")
+ if os.path.exists(metadata_file):
+ import json
+ with open(metadata_file, 'r') as f:
+ metadata = json.load(f)
+ logger.info(f"Empty embeds info from metadata: shape={metadata.get('empty_embeds_shape')}")
+ # Verify text encoder architecture matches
+ if metadata.get("text_encoder_architecture") != args.text_encoder_architecture:
+ logger.warning(
+ f"Text encoder architecture mismatch: "
+ f"empty_embeds was extracted with {metadata.get('text_encoder_architecture')}, "
+ f"but training uses {args.text_encoder_architecture}"
+ )
+
+ # Note: We don't move to device here, will move during training loop if needed
+ logger.info(f"Loaded empty_embeds: shape={empty_embeds.shape}, dtype={empty_embeds.dtype}")
+ else:
+ empty_embeds = None
+ logger.info("Skipping empty_embeds loading - cond_dropout_prob is 0.0")
+
+ # We need to recalculate our total training steps as the size of the training dataloader may have changed.
+ num_update_steps_per_epoch = math.ceil(train_dataloader.num_batches / args.gradient_accumulation_steps)
+ # Afterwards we recalculate our number of training epochs.
+ # Note: We are not doing epoch based training here, but just using this for book keeping and being able to
+ # reuse the same training loop with other datasets/loaders.
+ # max_train_steps should already be set above if it was None
+ num_train_epochs = math.ceil(args.max_train_steps / num_update_steps_per_epoch) if num_update_steps_per_epoch > 0 else 1
+
+ # Train!
+ logger.info("***** Running training *****")
+ logger.info(f" Num training steps = {args.max_train_steps}")
+ logger.info(f" Instantaneous batch size per device = {args.train_batch_size}")
+ logger.info(f" Total train batch size (w. parallel, distributed & accumulation) = {total_batch_size}")
+ logger.info(f" Gradient Accumulation steps = {args.gradient_accumulation_steps}")
+
+ resume_from_checkpoint = args.resume_from_checkpoint
+ if resume_from_checkpoint:
+ if resume_from_checkpoint == "latest":
+ # Get the most recent checkpoint
+ dirs = os.listdir(args.output_dir)
+ dirs = [d for d in dirs if d.startswith("checkpoint")]
+ dirs = sorted(dirs, key=lambda x: int(x.split("-")[1]))
+ if len(dirs) > 0:
+ resume_from_checkpoint = os.path.join(args.output_dir, dirs[-1])
+ else:
+ resume_from_checkpoint = None
+
+ if resume_from_checkpoint is None:
+ accelerator.print(
+ f"Checkpoint '{args.resume_from_checkpoint}' does not exist. Starting a new training run."
+ )
+ else:
+ accelerator.print(f"Resuming from checkpoint {resume_from_checkpoint}")
+
+ if resume_from_checkpoint is None:
+ global_step = 0
+ first_epoch = 0
+ else:
+ accelerator.load_state(resume_from_checkpoint)
+ global_step = int(os.path.basename(resume_from_checkpoint).split("-")[1])
+ first_epoch = global_step // num_update_steps_per_epoch
+
+ # EMA not used for video training
+
+ # As stated above, we are not doing epoch based training here, but just using this for book keeping and being able to
+ # reuse the same training loop with other datasets/loaders.
+ for epoch in range(first_epoch, num_train_epochs):
+ for batch in train_dataloader:
+ with torch.no_grad():
+ if args.use_precomputed_features:
+ # Use pre-extracted features (both video and text)
+ # Features are already on CPU with correct dtypes (video_codes: int32, text_embedding: float16/bfloat16)
+ # Move to device and convert dtype in one step to avoid intermediate copies
+ weight_dtype = {
+ "fp16": torch.float16,
+ "bf16": torch.bfloat16,
+ }.get(accelerator.mixed_precision, torch.float32)
+
+ video_tokens = batch["video_codes"].to(
+ device=accelerator.device,
+ non_blocking=False, # CPU->GPU
+ ) # [B, F', H', W'], int32/int64, on GPU
+
+ encoder_hidden_states = batch["text_embedding"].to(
+ device=accelerator.device,
+ dtype=weight_dtype,
+ non_blocking=False,
+ ) # [B, L, D], float16/bfloat16, on GPU
+
+ # Get context lengths for precomputed features
+ # batch["context_len"] is a tensor from DataLoader collation [B]
+ if "context_len" in batch and batch["context_len"] is not None:
+ context_lens = batch["context_len"].to(dtype=torch.long, device=accelerator.device)
+ else:
+ # Fallback: assume all sequences have full length (512 for T5)
+ context_lens = torch.full((video_tokens.shape[0],), 512, dtype=torch.long, device=accelerator.device)
+ elif args.use_precomputed_video_only:
+ # Use pre-extracted video codes only, encode text at runtime
+ video_tokens = batch["video_codes"].to(
+ device=accelerator.device,
+ non_blocking=False, # CPU->GPU
+ ) # [B, F', H', W'], int32/int64, on GPU
+
+ # Encode text at runtime with the loaded text encoder
+ prompt_input_ids = batch["prompt_input_ids"].to(accelerator.device, non_blocking=True)
+ prompt_attention_mask = batch["prompt_attention_mask"].to(accelerator.device, non_blocking=True)
+
+ encoder_hidden_states, cond_embeds, context_lens = encode_prompt(
+ text_encoder, prompt_input_ids, args.text_encoder_architecture, prompt_attention_mask
+ )
+
+ del prompt_input_ids, prompt_attention_mask
+ if cond_embeds is not None:
+ del cond_embeds
+
+ batch_size = video_tokens.shape[0]
+ # with torch.no_grad():
+ # t_min = video_tokens.min().item()
+ # t_max = video_tokens.max().item()
+ # # if accelerator.is_main_process:
+ # # print(f"[DEBUG] tokens range = [{t_min}, {t_max}]")
+ # # try:
+ # # vocab = model.backbone.token_embed.num_embeddings
+ # # except Exception:
+ # # vocab = 16000
+ # # print(f"[DEBUG] model vocab_size = {vocab}")
+
+ # assert t_min >= 0, f"Found negative token id: {t_min}"
+
+ # assert t_max < 16000, f"Found token id {t_max} >= vocab_size 16000"
+ # Delete batch after moving data to GPU to free CPU memory
+ del batch
+
+ # Debug print (only on main process, first batch)
+ if accelerator.is_main_process and global_step == 0 and epoch == first_epoch:
+ print(f"[DEBUG] video_tokens: shape={video_tokens.shape}, dtype={video_tokens.dtype}, device={video_tokens.device}")
+ print(f"[DEBUG] encoder_hidden_states: shape={encoder_hidden_states.shape}, dtype={encoder_hidden_states.dtype}, device={encoder_hidden_states.device}")
+ else:
+ # Video training path - encode on-the-fly
+ video_values = batch["video"].to(accelerator.device, non_blocking=True) # [B, C, F, H, W]
+ batch_size = video_values.shape[0]
+
+ # Encode video to discrete tokens using CosmosVideoTokenizer
+ split_batch_size = args.split_vae_encode if args.split_vae_encode is not None else batch_size
+ num_splits = math.ceil(batch_size / split_batch_size)
+ video_tokens = []
+ for i in range(num_splits):
+ start_idx = i * split_batch_size
+ end_idx = min((i + 1) * split_batch_size, batch_size)
+ # video_values: [B, C, F, H, W]
+ tokens = video_tokenizer.encode(video_values[start_idx:end_idx]) # [B, F', H', W']
+ video_tokens.append(tokens)
+ video_tokens = torch.cat(video_tokens, dim=0) # [B, F', H', W']
+
+ if "prompt_input_ids" in batch:
+ with nullcontext() if args.train_text_encoder else torch.no_grad():
+ # Get attention_mask from batch or assume full length
+ attention_mask = batch.get("prompt_attention_mask", None)
+ if attention_mask is not None:
+ attention_mask = attention_mask.to(accelerator.device, non_blocking=True)
+ encoder_hidden_states, cond_embeds, context_lens = encode_prompt(
+ text_encoder, batch["prompt_input_ids"].to(accelerator.device, non_blocking=True), args.text_encoder_architecture, attention_mask
+ )
+
+ # Flatten video tokens for masking: [B, F', H', W'] -> [B, F'*H'*W']
+ B, F_prime, H_prime, W_prime = video_tokens.shape
+ seq_len = F_prime * H_prime * W_prime
+ video_tokens_flat = video_tokens.view(B, seq_len) # [B, seq_len]
+
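+ # Sample a masking ratio per example from the cosine schedule: t ~ U(0, 1), ratio = cos(pi * t / 2).
+ # Use B (from video_tokens.shape) because batch_size is not set on every data path.
+ timesteps = torch.rand(B, device=video_tokens_flat.device)
+ mask_prob = torch.cos(timesteps * math.pi * 0.5)
+ mask_prob = mask_prob.clip(args.min_masking_rate)
+
+ # Mask the num_token_masked positions that rank lowest in a random permutation (at least one per example)
+ num_token_masked = (seq_len * mask_prob).round().clamp(min=1)
+ batch_randperm = torch.rand(B, seq_len, device=video_tokens_flat.device).argsort(dim=-1)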
+ mask = batch_randperm < num_token_masked.unsqueeze(-1)
+
+ mask_id = video_tokenizer.mask_token_id # codebook_size
+ input_ids_flat = torch.where(mask, mask_id, video_tokens_flat)
+ labels_flat = torch.where(mask, video_tokens_flat, -100)
+
+ # Reshape back to [B, F', H', W'] for model forward
+ input_ids = input_ids_flat.view(B, F_prime, H_prime, W_prime)
+ labels = labels_flat.view(B, F_prime, H_prime, W_prime)
+
+ if args.cond_dropout_prob > 0.0:
+ assert encoder_hidden_states is not None
+ assert empty_embeds is not None, "empty_embeds must be loaded when cond_dropout_prob > 0.0"
+
+ batch_size = encoder_hidden_states.shape[0]
+
+ # Move empty_embeds to device if needed (for precomputed features case)
+ if empty_embeds.device != encoder_hidden_states.device:
+ empty_embeds = empty_embeds.to(encoder_hidden_states.device)
+
+ # mask is True for the samples whose text conditioning is dropped (probability cond_dropout_prob)
+ mask = torch.rand((batch_size, 1, 1), device=encoder_hidden_states.device) < args.cond_dropout_prob
+
+ empty_embeds_ = empty_embeds.expand(batch_size, -1, -1)
+ encoder_hidden_states = torch.where(mask, empty_embeds_, encoder_hidden_states)
+
+ # Handle cond_embeds dropout (only for CLIP, not for T5)
+ # For T5/UMT5, cond_embeds is None, so skip this step
+
+ # Video tokens are already in [B, F', H', W'] format, no need to reshape
+
+ # Train Step
+ with accelerator.accumulate(model):
+ # Video training: use WanDiscreteVideoTransformer
+ # vocab_size is already saved before torch.compile
+
+ # Prepare timesteps: [B] -> [B] (scalar timesteps for video)
+ timesteps_tensor = (mask_prob * 1000).long().to(input_ids.device)
+
+ # Ensure encoder_hidden_states is on correct dtype
+ # Note: For precomputed features, dtype is already converted above
+ if encoder_hidden_states is not None and not args.use_precomputed_features:
+ # Only convert if not using precomputed features (already converted above)
+ if accelerator.mixed_precision == "fp16" and encoder_hidden_states.dtype != torch.float16:
+ encoder_hidden_states = encoder_hidden_states.to(dtype=torch.float16)
+ elif accelerator.mixed_precision == "bf16" and encoder_hidden_states.dtype != torch.bfloat16:
+ encoder_hidden_states = encoder_hidden_states.to(dtype=torch.bfloat16)
+ elif accelerator.mixed_precision == "no" and encoder_hidden_states.dtype != torch.float32:
+ encoder_hidden_states = encoder_hidden_states.to(dtype=torch.float32)
+
+ # Forward pass: input_ids is [B, F', H', W'], encoder_hidden_states is [B, L, D]
+ logits = model(
+ tokens=input_ids, # [B, F', H', W']
+ timesteps=timesteps_tensor, # [B]
+ encoder_hidden_states=encoder_hidden_states, # [B, L, D]
+ context_lens=context_lens, # [B] - effective text lengths
+ y=None,
+ ) # Returns [B, vocab_size, F', H', W']
+
+ # No reshape is needed for the loss: F.cross_entropy accepts
+ # logits of shape [B, vocab_size, F', H', W'] with labels of shape [B, F', H', W']
+ B, vocab_size, F_prime_logits, H_prime_logits, W_prime_logits = logits.shape
+ # logits = logits.permute(0, 2, 3, 4, 1).reshape(B * F_prime_logits * H_prime_logits * W_prime_logits, vocab_size)
+
+ # # labels: [B, F', H', W'] - may have different dimensions due to patch/unpatch operations
+ # # Crop labels to match logits dimensions if needed
+ # B_labels, F_prime_labels, H_prime_labels, W_prime_labels = labels.shape
+ # assert B == B_labels, f"Batch size mismatch: logits {B} vs labels {B_labels}"
+
+ # # # Crop labels to match logits spatial dimensions
+ # # if F_prime_labels != F_prime_logits or H_prime_labels != H_prime_logits or W_prime_labels != W_prime_logits:
+ # # # Crop labels to match logits dimensions
+ # # labels = labels[:, :F_prime_logits, :H_prime_logits, :W_prime_logits]
+
+ # # labels: [B, F', H', W'] -> [B*F'*H'*W']
+ # labels_flat = labels.reshape(-1)
+
+ # # Convert to long (int64) for cross_entropy (required by CUDA kernel)
+ # # video_tokens might be int32 from precomputed features
+ # labels_flat = labels_flat.long()
+
+ # loss = F.cross_entropy(
+ # logits,
+ # labels_flat,
+ # ignore_index=-100,
+ # reduction="mean",
+ # )
+ loss = F.cross_entropy(
+ logits, # [B, vocab, F, H, W]
+ labels.long(), # [B, F, H, W]
+ ignore_index=-100,
+ )
+
+
+ # Gather the losses across all processes for logging (if we use distributed training).
+ avg_loss = accelerator.gather(loss.repeat(args.train_batch_size)).mean()
+ avg_masking_rate = accelerator.gather(mask_prob.repeat(args.train_batch_size)).mean()
+
+ accelerator.backward(loss)
+
+ if args.max_grad_norm is not None and accelerator.sync_gradients:
+ accelerator.clip_grad_norm_(model.parameters(), args.max_grad_norm)
+
+ optimizer.step()
+ lr_scheduler.step()
+
+ optimizer.zero_grad(set_to_none=True)
+
+ # Checks if the accelerator has performed an optimization step behind the scenes
+ if accelerator.sync_gradients:
+ # EMA not used for video training
+
+ if (global_step + 1) % args.logging_steps == 0:
+ logs = {
+ "step_loss": avg_loss.item(),
+ "lr": lr_scheduler.get_last_lr()[0],
+ "avg_masking_rate": avg_masking_rate.item(),
+ }
+ accelerator.log(logs, step=global_step + 1)
+
+ logger.info(
+ f"Step: {global_step + 1} "
+ f"Loss: {avg_loss.item():0.4f} "
+ f"LR: {lr_scheduler.get_last_lr()[0]:0.6f}"
+ )
+
+ if (global_step + 1) % args.checkpointing_steps == 0:
+ save_checkpoint(args, accelerator, global_step + 1, logger)
+
+ if (global_step + 1) % args.validation_steps == 0 and accelerator.is_main_process:
+ # EMA not used for video training
+
+ with torch.no_grad():
+ logger.info("Generating videos for validation...")
+
+ model.eval()
+
+ # Load text encoder and video tokenizer for validation if using precomputed features
+ # Use different variable names to avoid shadowing global variables
+ val_text_encoder = None
+ val_tokenizer = None
+ val_video_tokenizer = None
+
+ if args.use_precomputed_features:
+ logger.info("Loading text encoder and video tokenizer for validation...")
+
+ # Load text encoder
+ if args.text_encoder_architecture == "umt5-base":
+ model_id = "google/umt5-base"
+ elif args.text_encoder_architecture == "umt5-xxl":
+ model_id = "google/umt5-xxl"
+ elif args.text_encoder_architecture == "t5":
+ model_id = "t5-base"
+ else:
+ raise ValueError(f"Unknown text encoder architecture: {args.text_encoder_architecture}")
+
+ val_text_encoder = T5EncoderModel.from_pretrained(model_id, torch_dtype=weight_dtype)
+ val_tokenizer = T5Tokenizer.from_pretrained(model_id)
+
+ val_text_encoder.eval()
+ val_text_encoder.requires_grad_(False)
+ elif args.use_precomputed_video_only:
+ logger.info("Reusing already loaded text encoder for validation...")
+ # Reuse the text encoder and tokenizer already loaded for training
+ # Need to unwrap from accelerator wrapper first
+ val_text_encoder = safe_unwrap_model(text_encoder, accelerator)
+ val_text_encoder.eval()
+ val_text_encoder.requires_grad_(False)
+ val_tokenizer = tokenizer
+
+ # Load video tokenizer
+ val_video_tokenizer = CosmosVideoTokenizer(
+ model_id=args.video_tokenizer_model_id,
+ device=accelerator.device,
+ dtype=weight_dtype
+ )
+ val_video_tokenizer.requires_grad_(False)
+ val_video_tokenizer.eval()
+
+ logger.info("Text encoder and video tokenizer loaded for validation")
+ else:
+ # Use global variables when not using precomputed features
+ val_text_encoder = text_encoder
+ val_tokenizer = tokenizer
+ val_video_tokenizer = video_tokenizer
+ if args.train_text_encoder:
+ val_text_encoder.eval()
+
+ # Video pipeline validation
+ logger.info("Generating videos for validation...")
+
+ # For video, create scheduler with mask_token_id
+ scheduler = Scheduler(
+ mask_token_id=val_video_tokenizer.mask_token_id,
+ masking_schedule="cosine"
+ )
+ scheduler.set_timesteps(num_inference_steps=48, device=accelerator.device)
+
+ # Get the unwrapped transformer and ensure it has the correct dtype
+ unwrapped_transformer = safe_unwrap_model(model, accelerator)
+ # Ensure transformer is on the correct dtype (text_embedding was randomly initialized as float32)
+ unwrapped_transformer = unwrapped_transformer.to(dtype=weight_dtype)
+
+ pipe = VideoPipeline(
+ tokenizer=val_tokenizer,
+ text_encoder=val_text_encoder,
+ transformer=unwrapped_transformer,
+ scheduler=scheduler,
+ video_tokenizer=val_video_tokenizer,
+ text_len=512,
+ num_frames=args.num_frames,
+ height=args.video_height,
+ width=args.video_width,
+ )
+
+ # Generate videos
+ try:
+ videos = pipe(
+ prompt=args.validation_prompts,
+ num_frames=args.num_frames,
+ height=args.video_height,
+ width=args.video_width,
+ guidance_scale=9.0,
+ num_inference_steps=48,
+ output_type="pil",
+ ).videos
+
+ # Log videos to wandb (save first frame of each video)
+ if is_wandb_available():
+ wandb_images = []
+ for i, video in enumerate(videos):
+ if isinstance(video, list) and len(video) > 0:
+ first_frame = video[0]
+ elif isinstance(video, torch.Tensor):
+ first_frame = transforms.ToPILImage()(video[:, 0, :, :].clamp(0, 1))
+ else:
+ first_frame = video
+ if first_frame is not None:
+ prompt_caption = args.validation_prompts[i] if i < len(args.validation_prompts) else f"video_{i}"
+ wandb_images.append(wandb.Image(first_frame, caption=prompt_caption))
+ if wandb_images:
+ wandb.log({"generated_videos_first_frame": wandb_images}, step=global_step + 1)
+
+ # Save video frames as grid
+ for i, video in enumerate(videos):
+ if isinstance(video, list):
+ frames = [transforms.ToTensor()(frame) for frame in video]
+ if frames:
+ frames_tensor = torch.stack(frames, dim=0)
+ grid = make_grid(frames_tensor, nrow=min(4, len(frames)))
+ grid_path = os.path.join(args.output_dir, f"{global_step + 1}_video_{i}_CFG-9.png")
+ save_image(grid, grid_path)
+ if is_wandb_available():
+ wandb.log(
+ {"generated_videos_grid": wandb.Image(grid, caption=f"video_{i}_grid")},
+ step=global_step + 1,
+ )
+ elif isinstance(video, torch.Tensor):
+ C, num_frames_video, H, W = video.shape
+ frames_list = [video[:, f, :, :] for f in range(num_frames_video)]
+ frames_tensor = torch.stack(frames_list, dim=0)
+ grid = make_grid(frames_tensor, nrow=min(4, num_frames_video))
+ grid_path = os.path.join(args.output_dir, f"{global_step + 1}_video_{i}_CFG-9.png")
+ save_image(grid, grid_path)
+ if is_wandb_available():
+ wandb.log(
+ {"generated_videos_grid": wandb.Image(grid, caption=f"video_{i}_grid")},
+ step=global_step + 1,
+ )
+
+ logger.info(f"Validation videos saved to {args.output_dir}")
+ except Exception as e:
+ logger.error(f"Video validation failed: {e}")
+ import traceback
+ traceback.print_exc()
+ finally:
+ # Clean up models loaded for validation (if using precomputed features)
+ if args.use_precomputed_features:
+ # Delete validation models to free GPU memory
+ if 'val_text_encoder' in locals():
+ del val_text_encoder
+ if 'val_tokenizer' in locals():
+ del val_tokenizer
+ if 'val_video_tokenizer' in locals():
+ del val_video_tokenizer
+ if 'pipe' in locals():
+ del pipe
+ if 'scheduler' in locals():
+ del scheduler
+ # Clear CUDA cache
+ torch.cuda.empty_cache()
+ logger.info("Cleaned up validation models and freed GPU memory")
+ elif args.use_precomputed_video_only:
+ # Don't delete text_encoder/tokenizer as they are reused from training
+ # Only clean up validation-specific models
+ if 'val_video_tokenizer' in locals():
+ del val_video_tokenizer
+ if 'pipe' in locals():
+ del pipe
+ if 'scheduler' in locals():
+ del scheduler
+ # Clear CUDA cache
+ torch.cuda.empty_cache()
+ logger.info("Cleaned up validation models (kept text_encoder for reuse)")
+
+ model.train()
+
+ if args.train_text_encoder and not args.use_precomputed_features and not args.use_precomputed_video_only:
+ # Only set train mode if text_encoder is still loaded (not using precomputed features)
+ if text_encoder is not None:
+ text_encoder.train()
+
+ # EMA not used for video training
+
+ global_step += 1
+
+ # Stop training if max steps is reached
+ if global_step >= args.max_train_steps:
+ break
+ # End for
+
+ accelerator.wait_for_everyone()
+
+ # Save a checkpoint at the end of training
+ save_checkpoint(args, accelerator, global_step, logger)
+
+ # Save the final trained checkpoint
+ if accelerator.is_main_process:
+ model = safe_unwrap_model(model, accelerator)
+ # EMA not used for video training
+ model.save_pretrained(args.output_dir)
+
+ accelerator.end_training()
+
+
+
+
+
+if __name__ == "__main__":
+ main(parse_args())
+
+
diff --git a/Meissonic/train/train_meissonic.py b/Meissonic/train/train_meissonic.py
new file mode 100644
index 0000000000000000000000000000000000000000..432ff946f0fcf1a020e417a12d6afa1a048374f2
--- /dev/null
+++ b/Meissonic/train/train_meissonic.py
@@ -0,0 +1,1085 @@
+# Copyright 2024 The HuggingFace Team and The MeissonFlow Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+import copy
+import logging
+import math
+import os
+from contextlib import nullcontext
+from pathlib import Path
+import sys
+sys.path.append(os.getcwd())
+import torch
+import torch.nn.functional as F
+from accelerate import Accelerator
+from accelerate.logging import get_logger
+from accelerate.utils import ProjectConfiguration, set_seed
+from peft import LoraConfig
+from peft.utils import get_peft_model_state_dict
+from torch.utils.data import DataLoader, default_collate
+from torchvision import transforms
+from transformers import (
+ CLIPTextModelWithProjection,
+ CLIPTokenizer,
+)
+import diffusers.optimization
+from diffusers import EMAModel, VQModel
+from src.scheduler import Scheduler
+from diffusers.loaders import LoraLoaderMixin
+from diffusers.utils import is_wandb_available
+from src.pipeline import Pipeline
+from torchvision.utils import save_image, make_grid
+from datasets import load_dataset
+from train.trainer_utils import save_checkpoint
+from train.dataset_utils import MyParquetDataset, HuggingFaceDataset
+from train.dataset_utils import tokenize_prompt, encode_prompt
+from src.transformer import Transformer2DModel
+
+if is_wandb_available():
+ import wandb
+ # wandb.login(key="")
+
+logger = get_logger(__name__, log_level="INFO")
+
+import torch._dynamo
+torch._dynamo.config.verbose = True
+
+# Optionally suppress errors to fall back to eager execution
+torch._dynamo.config.suppress_errors = True
+
+def parse_args():
+ parser = argparse.ArgumentParser()
+ parser.add_argument(
+ "--pretrained_model_architecture",
+ type=str,
+ default="Meissonic",
+ required=False
+ )
+ parser.add_argument(
+ "--text_encoder_architecture",
+ type=str,
+ default="open_clip",
+ required=False,
+ help="The architecture of the text encoder. One of ['CLIP', 'open_clip', 'flan-t5-base', 'Qwen2-0.5B', 'gemini-2b', 'long_CLIP_T5_base', 'CLIP_T5_base']. Only 'open_clip' is currently implemented.",
+ )
+ parser.add_argument(
+ "--instance_dataset",
+ type=str,
+ default=None,
+ required=False,
+ help="The dataset to use for training. One of ['MSCOCO600K', 'PickaPicV2']",
+ )
+ parser.add_argument(
+ "--training_from_scratch",
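+ type=lambda s: s.lower() in ("1", "true", "yes"),  # type=bool would parse any non-empty string, even "False", as True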
+ default=False,
+ required=False
+ )
+ parser.add_argument(
+ "--pretrained_model_name_or_path",
+ type=str,
+ default=None,
+ required=True,
+ help="Path to pretrained model or model identifier from huggingface.co/models.",
+ )
+ parser.add_argument(
+ "--revision",
+ type=str,
+ default=None,
+ required=False,
+ help="Revision of pretrained model identifier from huggingface.co/models.",
+ )
+ parser.add_argument(
+ "--variant",
+ type=str,
+ default=None,
+ help="Variant of the model files of the pretrained model identifier from huggingface.co/models, e.g. 'fp16'",
+ )
+ parser.add_argument(
+ "--instance_data_dataset",
+ type=str,
+ default=None,
+ required=False,
+ help="A Hugging Face dataset containing the training images",
+ )
+ parser.add_argument(
+ "--instance_data_dir",
+ type=str,
+ default=None,
+ required=False,
+ help="A folder containing the training data of instance images.",
+ )
+ parser.add_argument(
+ "--instance_data_image", type=str, default=None, required=False, help="A single training image"
+ )
+ parser.add_argument(
+ "--use_8bit_adam", action="store_true", help="Whether or not to use 8-bit Adam from bitsandbytes."
+ )
+ parser.add_argument(
+ "--dataloader_num_workers",
+ type=int,
+ default=0,
+ help=(
+ "Number of subprocesses to use for data loading. 0 means that the data will be loaded in the main process."
+ ),
+ )
+ parser.add_argument(
+ "--allow_tf32",
+ action="store_true",
+ help=(
+ "Whether or not to allow TF32 on Ampere GPUs. Can be used to speed up training. For more information, see"
+ " https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices"
+ ),
+ )
+ parser.add_argument("--use_ema", action="store_true", help="Whether to use EMA model.")
+ parser.add_argument("--ema_decay", type=float, default=0.9999)
+ parser.add_argument("--ema_update_after_step", type=int, default=0)
+ parser.add_argument("--adam_beta1", type=float, default=0.9, help="The beta1 parameter for the Adam optimizer.")
+ parser.add_argument("--adam_beta2", type=float, default=0.999, help="The beta2 parameter for the Adam optimizer.")
+ parser.add_argument("--adam_weight_decay", type=float, default=1e-2, help="Weight decay to use.")
+ parser.add_argument("--adam_epsilon", type=float, default=1e-08, help="Epsilon value for the Adam optimizer")
+ parser.add_argument(
+ "--output_dir",
+ type=str,
+ default="muse_training",
+ help="The output directory where the model predictions and checkpoints will be written.",
+ )
+ parser.add_argument("--seed", type=int, default=None, help="A seed for reproducible training.")
+ parser.add_argument(
+ "--logging_dir",
+ type=str,
+ default="logs",
+ help=(
+ "[TensorBoard](https://www.tensorflow.org/tensorboard) log directory. Will default to"
+ " *output_dir/runs/**CURRENT_DATETIME_HOSTNAME***."
+ ),
+ )
+ parser.add_argument(
+ "--max_train_steps",
+ type=int,
+ default=None,
+ help="Total number of training steps to perform. If provided, overrides num_train_epochs.",
+ )
+ parser.add_argument(
+ "--checkpointing_steps",
+ type=int,
+ default=500,
+ help=(
+ "Save a checkpoint of the training state every X updates. Checkpoints can be used for resuming training via `--resume_from_checkpoint`. "
+ "In the case that the checkpoint is better than the final trained model, the checkpoint can also be used for inference. "
+ "Using a checkpoint for inference requires separate loading of the original pipeline and the individual checkpointed model components. "
+ "See https://huggingface.co/docs/diffusers/main/en/training/dreambooth#performing-inference-using-a-saved-checkpoint for step by step "
+ "instructions."
+ ),
+ )
+ parser.add_argument(
+ "--logging_steps",
+ type=int,
+ default=50,
+ )
+ parser.add_argument(
+ "--checkpoints_total_limit",
+ type=int,
+ default=None,
+ help=(
+ "Max number of checkpoints to store. Passed as `total_limit` to the `Accelerator` `ProjectConfiguration`."
+ " See Accelerator::save_state https://huggingface.co/docs/accelerate/package_reference/accelerator#accelerate.Accelerator.save_state"
+ " for more details"
+ ),
+ )
+ parser.add_argument(
+ "--resume_from_checkpoint",
+ type=str,
+ default=None,
+ help=(
+ "Whether training should be resumed from a previous checkpoint. Use a path saved by"
+ ' `--checkpointing_steps`, or `"latest"` to automatically select the last available checkpoint.'
+ ),
+ )
+ parser.add_argument(
+ "--train_batch_size", type=int, default=16, help="Batch size (per device) for the training dataloader."
+ )
+ parser.add_argument(
+ "--gradient_accumulation_steps",
+ type=int,
+ default=1,
+ help="Number of update steps to accumulate before performing a backward/update pass.",
+ )
+ parser.add_argument(
+ "--learning_rate",
+ type=float,
+ default=0.0003,
+ help="Initial learning rate (after the potential warmup period) to use.",
+ )
+ parser.add_argument(
+ "--scale_lr",
+ action="store_true",
+ default=False,
+ help="Scale the learning rate by the number of GPUs, gradient accumulation steps, and batch size.",
+ )
+ parser.add_argument(
+ "--lr_scheduler",
+ type=str,
+ default="constant",
+ help=(
+ 'The scheduler type to use. Choose between ["linear", "cosine", "cosine_with_restarts", "polynomial",'
+ ' "constant", "constant_with_warmup"]'
+ ),
+ )
+ parser.add_argument(
+ "--lr_warmup_steps", type=int, default=500, help="Number of steps for the warmup in the lr scheduler."
+ )
+ parser.add_argument(
+ "--validation_steps",
+ type=int,
+ default=100,
+ help=(
+ "Run validation every X steps. Validation consists of generating images for"
+ " each prompt in `--validation_prompts`"
+ " and logging the images."
+ ),
+ )
+ parser.add_argument(
+ "--mixed_precision",
+ type=str,
+ default=None,
+ choices=["no", "fp16", "bf16"],
+ help=(
+ "Whether to use mixed precision. Choose between fp16 and bf16 (bfloat16). Bf16 requires PyTorch >="
+ " 1.10 and an Nvidia Ampere GPU. Defaults to the value of accelerate config of the current system or the"
+ " flag passed with the `accelerate.launch` command. Use this argument to override the accelerate config."
+ ),
+ )
+ parser.add_argument(
+ "--report_to",
+ type=str,
+ default="wandb",
+ help=(
+ 'The integration to report the results and logs to. Supported platforms are `"tensorboard"`,'
+ ' `"wandb"` (default) and `"comet_ml"`. Use `"all"` to report to all integrations.'
+ ),
+ )
+ parser.add_argument("--validation_prompts", type=str, nargs="*")
+ parser.add_argument(
+ "--resolution",
+ type=int,
+ default=512,
+ help=(
+ "The resolution for input images, all the images in the train/validation dataset will be resized to this"
+ " resolution"
+ ),
+ )
+ parser.add_argument("--split_vae_encode", type=int, required=False, default=None)
+ parser.add_argument("--min_masking_rate", type=float, default=0.0)
+ parser.add_argument("--cond_dropout_prob", type=float, default=0.0)
+ parser.add_argument("--max_grad_norm", default=50.0, type=float, help="Max gradient norm.", required=False)
+ parser.add_argument("--use_lora", action="store_true", help="Fine-tune the model using LoRA")
+ parser.add_argument("--text_encoder_use_lora", action="store_true", help="Fine-tune the text encoder using LoRA")
+ parser.add_argument("--lora_r", default=16, type=int)
+ parser.add_argument("--lora_alpha", default=32, type=int)
+ parser.add_argument("--lora_target_modules", default=["to_q", "to_k", "to_v"], type=str, nargs="+")
+ parser.add_argument("--text_encoder_lora_r", default=16, type=int)
+ parser.add_argument("--text_encoder_lora_alpha", default=32, type=int)
+ parser.add_argument("--text_encoder_lora_target_modules", default=["to_q", "to_k", "to_v"], type=str, nargs="+")
+ parser.add_argument("--train_text_encoder", action="store_true")
+ parser.add_argument("--image_key", type=str, required=False)
+ parser.add_argument("--prompt_key", type=str, required=False)
+ parser.add_argument(
+ "--gradient_checkpointing",
+ action="store_true",
+ help="Whether or not to use gradient checkpointing to save memory at the expense of slower backward pass.",
+ )
+ parser.add_argument("--prompt_prefix", type=str, required=False, default=None)
+
+ args = parser.parse_args()
+
+ if args.report_to == "wandb":
+ if not is_wandb_available():
+ raise ImportError("Make sure to install wandb if you want to use it for logging during training.")
+
+ num_datasources = sum(
+ [x is not None for x in [args.instance_data_dir, args.instance_data_image, args.instance_data_dataset]]
+ )
+
+ if num_datasources != 1:
+ raise ValueError(
+ "provide one and only one of `--instance_data_dir`, `--instance_data_image`, or `--instance_data_dataset`"
+ )
+
+ if args.instance_data_dir is not None:
+ if not os.path.exists(args.instance_data_dir):
+ raise ValueError(f"Does not exist: `--instance_data_dir` {args.instance_data_dir}")
+
+ if args.instance_data_image is not None:
+ if not os.path.exists(args.instance_data_image):
+ raise ValueError(f"Does not exist: `--instance_data_image` {args.instance_data_image}")
+
+ if args.instance_data_dataset is not None and (args.image_key is None or args.prompt_key is None):
+ raise ValueError("`--instance_data_dataset` requires setting `--image_key` and `--prompt_key`")
+
+ return args
+
+def _prepare_latent_image_ids(batch_size, height, width, device, dtype):
+ latent_image_ids = torch.zeros(height // 2, width // 2, 3)
+ latent_image_ids[..., 1] = latent_image_ids[..., 1] + torch.arange(height // 2)[:, None]
+ latent_image_ids[..., 2] = latent_image_ids[..., 2] + torch.arange(width // 2)[None, :]
+
+ latent_image_id_height, latent_image_id_width, latent_image_id_channels = latent_image_ids.shape
+
+ latent_image_ids = latent_image_ids.reshape(
+ latent_image_id_height * latent_image_id_width, latent_image_id_channels
+ )
+ # latent_image_ids = latent_image_ids.unsqueeze(0).repeat(batch_size, 1, 1)
+
+ return latent_image_ids.to(device=device, dtype=dtype)
+
+def main(args):
+ if args.allow_tf32:
+ torch.backends.cuda.matmul.allow_tf32 = True
+
+ # if args.pretrained_model_architecture == "Meissonic":
+ # from src.pipeline import Pipeline
+ # else:
+ # raise ValueError(f"Unknown model architecture: {args.pretrained_model_architecture}")
+
+
+ logging_dir = Path(args.output_dir, args.logging_dir)
+
+ accelerator_project_config = ProjectConfiguration(project_dir=args.output_dir, logging_dir=logging_dir)
+
+ accelerator = Accelerator(
+ gradient_accumulation_steps=args.gradient_accumulation_steps,
+ mixed_precision=args.mixed_precision,
+ log_with=args.report_to,
+ project_config=accelerator_project_config,
+ )
+
+ if accelerator.is_main_process:
+ os.makedirs(args.output_dir, exist_ok=True)
+
+ # Make one log on every process with the configuration for debugging.
+ logging.basicConfig(
+ format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
+ datefmt="%m/%d/%Y %H:%M:%S",
+ level=logging.INFO,
+ )
+ logger.info(accelerator.state, main_process_only=False)
+
+ if accelerator.is_main_process:
+ accelerator.init_trackers("meissonic", config=vars(copy.deepcopy(args)))
+
+ if args.seed is not None:
+ set_seed(args.seed)
+
+ if args.text_encoder_architecture == "open_clip":
+ if args.resume_from_checkpoint:
+ text_encoder = CLIPTextModelWithProjection.from_pretrained(
+ args.resume_from_checkpoint, subfolder="text_encoder", variant=args.variant
+ )
+ tokenizer = CLIPTokenizer.from_pretrained(
+ args.resume_from_checkpoint, subfolder="tokenizer", variant=args.variant
+ )
+ else:
+ text_encoder = CLIPTextModelWithProjection.from_pretrained(
+ args.pretrained_model_name_or_path, subfolder="text_encoder", variant=args.variant
+ )
+ tokenizer = CLIPTokenizer.from_pretrained(
+ args.pretrained_model_name_or_path, subfolder="tokenizer", variant=args.variant
+ )
+
+ # elif args.text_encoder_architecture == "CLIP_T5_base":
+ # text_encoder_clip = CLIPTextModelWithProjection.from_pretrained(
+ # args.pretrained_model_name_or_path, subfolder="text_encoder", variant=args.variant
+ # )
+ # tokenizer_clip = CLIPTokenizer.from_pretrained(
+ # args.pretrained_model_name_or_path, subfolder="tokenizer", variant=args.variant
+ # )
+ # from transformers import T5Tokenizer, T5ForConditionalGeneration
+ # text_encoder_t5 = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base",torch_dtype=torch.float16)
+ # tokenizer_t5 = T5Tokenizer.from_pretrained("google/flan-t5-base",torch_dtype=torch.float16)
+ # text_encoder = [text_encoder_clip,text_encoder_t5]
+ # tokenizer = [tokenizer_clip,tokenizer_t5]
+ # elif args.text_encoder_architecture == "flan-t5-base":
+ # from transformers import T5Tokenizer, T5ForConditionalGeneration
+ # text_encoder = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base",torch_dtype=torch.float16)
+ # tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base",torch_dtype=torch.float16)
+ # elif args.text_encoder_architecture == "gemini-2b":
+ # raise NotImplementedError("Gemini-2b is not yet supported")
+ # elif args.text_encoder_architecture == "Qwen2-0.5B":
+ # raise NotImplementedError("Qwen2-0.5B is not yet supported")
+ else:
+ raise ValueError(f"Unknown text encoder architecture: {args.text_encoder_architecture}")
+
+ vq_model = VQModel.from_pretrained(
+ args.pretrained_model_name_or_path, subfolder="vqvae", revision=args.revision, variant=args.variant
+ )
+
+ if args.train_text_encoder:
+ if args.text_encoder_use_lora:
+ lora_config = LoraConfig(
+ r=args.text_encoder_lora_r,
+ lora_alpha=args.text_encoder_lora_alpha,
+ target_modules=args.text_encoder_lora_target_modules,
+ )
+ text_encoder.add_adapter(lora_config)
+ if args.text_encoder_architecture == "CLIP_T5_base": # Not supported yet; only open_clip is supported
+ text_encoder[0].train()
+ text_encoder[0].requires_grad_(True)
+ text_encoder[1].train()
+ text_encoder[1].requires_grad_(True)
+ else:
+ text_encoder.train()
+ text_encoder.requires_grad_(True)
+ else:
+ if args.text_encoder_architecture == "CLIP_T5_base": # Not supported yet; only open_clip is supported
+ text_encoder[0].eval()
+ text_encoder[0].requires_grad_(False)
+ text_encoder[1].eval()
+ text_encoder[1].requires_grad_(False)
+ else:
+ text_encoder.eval()
+ text_encoder.requires_grad_(False)
+
+ vq_model.requires_grad_(False)
+
+ if args.pretrained_model_architecture == "Meissonic":
+ if args.training_from_scratch:
+ model = Transformer2DModel(
+ patch_size=1,
+ in_channels=64,
+ num_layers=14,
+ num_single_layers=28,
+ attention_head_dim=128,
+ num_attention_heads=8,
+ joint_attention_dim=1024,
+ pooled_projection_dim=1024,
+ guidance_embeds=False,
+ axes_dims_rope=(16, 56, 56),
+ downsample=True,
+ upsample=True,
+ )
+
+ # model_tmp = Transformer2DModel.from_pretrained("LAST_STAGE_CKPT_PATH", low_cpu_mem_usage=False, device_map=None)
+ # model.load_state_dict(model_tmp.state_dict(), strict=False)
+ # del model_tmp
+ else:
+ model = Transformer2DModel.from_pretrained(args.pretrained_model_name_or_path, subfolder="transformer", low_cpu_mem_usage=False, device_map=None)
+ else:
+ raise ValueError(f"Unknown model architecture: {args.pretrained_model_architecture}")
+
+ model = torch.compile(model)
+
+ if args.use_lora:
+ lora_config = LoraConfig(
+ r=args.lora_r,
+ lora_alpha=args.lora_alpha,
+ target_modules=args.lora_target_modules,
+ )
+ model.add_adapter(lora_config)
+
+ model.train()
+
+ if args.gradient_checkpointing:
+ model.enable_gradient_checkpointing()
+ if args.train_text_encoder:
+ if args.text_encoder_architecture == "CLIP_T5_base": # Not supported yet; only open_clip is supported
+ text_encoder[0].gradient_checkpointing_enable()
+ text_encoder[1].gradient_checkpointing_enable()
+ else:
+ text_encoder.gradient_checkpointing_enable()
+
+ if args.use_ema: # The robustness of this path has not been verified
+ ema = EMAModel(
+ model.parameters(),
+ decay=args.ema_decay,
+ update_after_step=args.ema_update_after_step,
+ model_cls=Transformer2DModel,
+ model_config=model.config,
+ )
+
+ def save_model_hook(models, weights, output_dir):
+ if accelerator.is_main_process:
+ transformer_lora_layers_to_save = None
+ text_encoder_lora_layers_to_save = None
+
+ for model_ in models:
+ if isinstance(model_, type(accelerator.unwrap_model(model))):
+ if args.use_lora:
+ transformer_lora_layers_to_save = get_peft_model_state_dict(model_)
+ else:
+ model_.save_pretrained(os.path.join(output_dir, "transformer"))
+ elif isinstance(model_, type(accelerator.unwrap_model(text_encoder))):
+ if args.text_encoder_use_lora:
+ text_encoder_lora_layers_to_save = get_peft_model_state_dict(model_)
+ else:
+ model_.save_pretrained(os.path.join(output_dir, "text_encoder"))
+ else:
+ raise ValueError(f"unexpected save model: {model_.__class__}")
+
+ # make sure to pop weight so that corresponding model is not saved again
+ weights.pop()
+
+ if transformer_lora_layers_to_save is not None or text_encoder_lora_layers_to_save is not None:
+ LoraLoaderMixin.save_lora_weights(
+ output_dir,
+ unet_lora_layers=transformer_lora_layers_to_save,
+ text_encoder_lora_layers=text_encoder_lora_layers_to_save,
+ )
+
+ if args.use_ema:
+ ema.save_pretrained(os.path.join(output_dir, "ema_model"))
+
+ def load_model_hook(models, input_dir):
+ transformer = None
+ text_encoder_ = None
+
+ # torch.compile wraps the model, so its state-dict keys carry an '_orig_mod.' prefix; add it to the loaded keys so they match
+ def adap_compile(ori_dict):  # prefix each key with '_orig_mod.'
+ new_dict = {}
+ for k,v in ori_dict.items():
+ new_dict['_orig_mod.'+k] = v
+ return new_dict
+
+ while len(models) > 0:
+ model_ = models.pop()
+
+ if isinstance(model_, type(accelerator.unwrap_model(model))):
+ if args.use_lora:
+ transformer = model_
+ else:
+ if args.pretrained_model_architecture == "Meissonic":
+ load_model = Transformer2DModel.from_pretrained(os.path.join(input_dir, "transformer"), low_cpu_mem_usage=False, device_map=None)
+ else:
+ raise ValueError(f"Unknown model architecture: {args.pretrained_model_architecture}")
+ model_.load_state_dict(adap_compile(load_model.state_dict()))
+ del load_model
+ elif isinstance(model_, type(accelerator.unwrap_model(text_encoder))):
+ if args.text_encoder_use_lora:
+ text_encoder_ = model_
+ else:
+ try:
+ load_model = CLIPTextModelWithProjection.from_pretrained(os.path.join(input_dir, "text_encoder"))
+ model_.load_state_dict(load_model.state_dict())
+ # print('finished loading text encoder!')
+ except Exception:
+ print('No text encoder found in the checkpoint folder; downloading one from the Hugging Face Hub instead.')
+ load_model = CLIPTextModelWithProjection.from_pretrained("laion/CLIP-ViT-H-14-laion2B-s32B-b79K")
+ model_.load_state_dict(load_model.state_dict())
+ del load_model
+ else:
+ raise ValueError(f"unexpected load model: {model_.__class__}")
+
+ if transformer is not None or text_encoder_ is not None:
+ lora_state_dict, network_alphas = LoraLoaderMixin.lora_state_dict(input_dir)
+ LoraLoaderMixin.load_lora_into_text_encoder(
+ lora_state_dict, network_alphas=network_alphas, text_encoder=text_encoder_
+ )
+ LoraLoaderMixin.load_lora_into_transformer(
+ lora_state_dict, network_alphas=network_alphas, transformer=transformer
+ )
+
+ if args.use_ema:
+ load_from = EMAModel.from_pretrained(os.path.join(input_dir, "ema_model"), model_cls=Transformer2DModel)
+ ema.load_state_dict(load_from.state_dict())
+ del load_from
+
+ accelerator.register_load_state_pre_hook(load_model_hook)
+ accelerator.register_save_state_pre_hook(save_model_hook)
+
+ if args.scale_lr:
+ args.learning_rate = (
+ args.learning_rate * args.train_batch_size * accelerator.num_processes * args.gradient_accumulation_steps
+ )
+
+ if args.use_8bit_adam:
+ try:
+ import bitsandbytes as bnb
+ except ImportError:
+ raise ImportError(
+ "Please install bitsandbytes to use 8-bit Adam. You can do so by running `pip install bitsandbytes`"
+ )
+
+ optimizer_cls = bnb.optim.AdamW8bit
+ else:
+ optimizer_cls = torch.optim.AdamW
+
+ # no decay on bias and layernorm and embedding
+ no_decay = ["bias", "layer_norm.weight", "mlm_ln.weight", "embeddings.weight"]
+ optimizer_grouped_parameters = [
+ {
+ "params": [p for n, p in model.named_parameters() if not any(nd in n for nd in no_decay)],
+ "weight_decay": args.adam_weight_decay,
+ },
+ {
+ "params": [p for n, p in model.named_parameters() if any(nd in n for nd in no_decay)],
+ "weight_decay": 0.0,
+ },
+ ]
+
+ if args.train_text_encoder:
+ if args.text_encoder_architecture == "CLIP_T5_base": # Not supported yet; only open_clip is supported
+ optimizer_grouped_parameters.append(
+ {"params": text_encoder[0].parameters(), "weight_decay": args.adam_weight_decay}
+ )
+ optimizer_grouped_parameters.append(
+ {"params": text_encoder[1].parameters(), "weight_decay": args.adam_weight_decay}
+ )
+ else:
+ optimizer_grouped_parameters.append(
+ {"params": text_encoder.parameters(), "weight_decay": args.adam_weight_decay}
+ )
+
+ optimizer = optimizer_cls(
+ optimizer_grouped_parameters,
+ lr=args.learning_rate,
+ betas=(args.adam_beta1, args.adam_beta2),
+ weight_decay=args.adam_weight_decay,
+ eps=args.adam_epsilon,
+ )
+
+ logger.info("Creating dataloaders and lr_scheduler")
+
+ total_batch_size = args.train_batch_size * accelerator.num_processes * args.gradient_accumulation_steps
+
+ if args.instance_dataset == "MyParquetDataset":
+ dataset = MyParquetDataset(
+ root_dir=args.instance_data_dir, # something like '../parquets_father_dir/'
+ tokenizer=tokenizer,
+ size=args.resolution,
+ text_encoder_architecture=args.text_encoder_architecture
+ )
+ elif args.instance_dataset == 'HuggingFaceDataset': # a good first option: just download a dataset from Hugging Face
+ dataset = HuggingFaceDataset(
+ hf_dataset=load_dataset(args.instance_data_dir, split="train"), # a Hub dataset name or a local dataset path
+ tokenizer=tokenizer,
+ image_key='image',
+ prompt_key='caption',
+ prompt_prefix=args.prompt_prefix,
+ size=args.resolution,
+ text_encoder_architecture=args.text_encoder_architecture
+ )
+ elif args.instance_dataset == "DATA_TYPE":
+ raise NotImplementedError("DATA_TYPE is not yet supported")
+ # Some instructions
+ # Organize your text-image pairs in the following way:
+ # the __getitem__ method should return a dictionary with the keys 'image', 'micro_conds' and 'prompt_input_ids'
+ # For more details, please refer to the implementation of the MyParquetDataset class
+ else:
+ raise ValueError(f"Unknown instance_dataset: {args.instance_dataset}")
+
+ train_dataloader = DataLoader(
+ dataset,
+ batch_size=args.train_batch_size,
+ shuffle=True,
+ num_workers=args.dataloader_num_workers,
+ collate_fn=default_collate,
+ pin_memory=True,
+ )
+ train_dataloader.num_batches = len(train_dataloader)
+
+ lr_scheduler = diffusers.optimization.get_scheduler(
+ args.lr_scheduler,
+ optimizer=optimizer,
+ num_training_steps=args.max_train_steps * accelerator.num_processes,
+ num_warmup_steps=args.lr_warmup_steps * accelerator.num_processes,
+ )
+
+ logger.info("Preparing model, optimizer and dataloaders")
+
+ if args.train_text_encoder:
+ if args.text_encoder_architecture == "CLIP_T5_base": # Not supported yet; only open_clip is supported
+ model, optimizer, lr_scheduler, train_dataloader, text_encoder[0], text_encoder[1] = accelerator.prepare(
+ model, optimizer, lr_scheduler, train_dataloader, text_encoder[0], text_encoder[1]
+ )
+ else:
+ model, optimizer, lr_scheduler, train_dataloader, text_encoder = accelerator.prepare(
+ model, optimizer, lr_scheduler, train_dataloader, text_encoder
+ )
+ else:
+ model, optimizer, lr_scheduler, train_dataloader = accelerator.prepare(
+ model, optimizer, lr_scheduler, train_dataloader
+ )
+
+ train_dataloader.num_batches = len(train_dataloader)
+
+ weight_dtype = torch.float32
+ if accelerator.mixed_precision == "fp16":
+ weight_dtype = torch.float16
+ elif accelerator.mixed_precision == "bf16":
+ weight_dtype = torch.bfloat16
+
+ if not args.train_text_encoder:
+ if args.text_encoder_architecture == "CLIP_T5_base": # Not supported yet; only open_clip is supported
+ text_encoder[0].to(device=accelerator.device, dtype=weight_dtype)
+ text_encoder[1].to(device=accelerator.device, dtype=weight_dtype)
+ else:
+ text_encoder.to(device=accelerator.device, dtype=weight_dtype)
+
+ vq_model.to(device=accelerator.device)
+
+ if args.use_ema:
+ ema.to(accelerator.device)
+
+ with nullcontext() if args.train_text_encoder else torch.no_grad():
+ if args.text_encoder_architecture == "CLIP_T5_base": # Not supported yet; only open_clip is supported
+ _input_ids_tmp_ = tokenize_prompt(tokenizer, "", args.text_encoder_architecture)
+ _input_ids_tmp_[0] = _input_ids_tmp_[0].to(accelerator.device, non_blocking=True)
+ _input_ids_tmp_[1] = _input_ids_tmp_[1].to(accelerator.device, non_blocking=True)
+ empty_embeds, empty_clip_embeds = encode_prompt(
+ text_encoder, _input_ids_tmp_, args.text_encoder_architecture
+ )
+ else:
+ empty_embeds, empty_clip_embeds = encode_prompt(
+ text_encoder, tokenize_prompt(tokenizer, "", args.text_encoder_architecture).to(accelerator.device, non_blocking=True), args.text_encoder_architecture
+ )
+
+ # There is a single image, we can just pre-encode the single prompt
+ if args.instance_data_image is not None:
+ prompt = os.path.splitext(os.path.basename(args.instance_data_image))[0]
+ if args.text_encoder_architecture == "CLIP_T5_base": # Not supported yet; only open_clip is supported
+ _input_ids_tmp_ = tokenize_prompt(tokenizer, prompt, args.text_encoder_architecture)
+ _input_ids_tmp_[0] = _input_ids_tmp_[0].to(accelerator.device, non_blocking=True)
+ _input_ids_tmp_[1] = _input_ids_tmp_[1].to(accelerator.device, non_blocking=True)
+ encoder_hidden_states, cond_embeds = encode_prompt(
+ text_encoder, _input_ids_tmp_, args.text_encoder_architecture
+ )
+ else:
+ encoder_hidden_states, cond_embeds = encode_prompt(
+ text_encoder, tokenize_prompt(tokenizer, prompt, args.text_encoder_architecture).to(accelerator.device, non_blocking=True), args.text_encoder_architecture
+ )
+ encoder_hidden_states = encoder_hidden_states.repeat(args.train_batch_size, 1, 1)
+ cond_embeds = cond_embeds.repeat(args.train_batch_size, 1)
+
+ # We need to recalculate our total training steps as the size of the training dataloader may have changed.
+ num_update_steps_per_epoch = math.ceil(train_dataloader.num_batches / args.gradient_accumulation_steps)
+ # Afterwards we recalculate our number of training epochs.
+ # Note: We are not doing epoch-based training here, but just using this for bookkeeping and being able to
+ # reuse the same training loop with other datasets/loaders.
+ num_train_epochs = math.ceil(args.max_train_steps / num_update_steps_per_epoch)
+
+ # Train!
+ logger.info("***** Running training *****")
+ logger.info(f" Num training steps = {args.max_train_steps}")
+ logger.info(f" Instantaneous batch size per device = {args.train_batch_size}")
+ logger.info(f" Total train batch size (w. parallel, distributed & accumulation) = {total_batch_size}")
+ logger.info(f" Gradient Accumulation steps = {args.gradient_accumulation_steps}")
+
+ resume_from_checkpoint = args.resume_from_checkpoint
+ if resume_from_checkpoint:
+ if resume_from_checkpoint == "latest":
+ # Get the most recent checkpoint
+ dirs = os.listdir(args.output_dir)
+ dirs = [d for d in dirs if d.startswith("checkpoint")]
+ dirs = sorted(dirs, key=lambda x: int(x.split("-")[1]))
+ if len(dirs) > 0:
+ resume_from_checkpoint = os.path.join(args.output_dir, dirs[-1])
+ else:
+ resume_from_checkpoint = None
+
+ if resume_from_checkpoint is None:
+ accelerator.print(
+ f"Checkpoint '{args.resume_from_checkpoint}' does not exist. Starting a new training run."
+ )
+ else:
+ accelerator.print(f"Resuming from checkpoint {resume_from_checkpoint}")
+
+ if resume_from_checkpoint is None:
+ global_step = 0
+ first_epoch = 0
+ else:
+ accelerator.load_state(resume_from_checkpoint)
+ global_step = int(os.path.basename(resume_from_checkpoint).split("-")[1])
+ first_epoch = global_step // num_update_steps_per_epoch
+
+ # This is to solve the inconsistent tensor device issue
+ if args.use_ema:
+ ema.shadow_params = [p.to(accelerator.device) for p in ema.shadow_params]
+
+ # As stated above, we are not doing epoch-based training here, but just using this for bookkeeping and being able to
+ # reuse the same training loop with other datasets/loaders.
+ for epoch in range(first_epoch, num_train_epochs):
+ for batch in train_dataloader:
+ torch.cuda.empty_cache()
+ with torch.no_grad():
+ micro_conds = batch["micro_conds"].to(accelerator.device, non_blocking=True)
+ pixel_values = batch["image"].to(accelerator.device, non_blocking=True)
+
+ batch_size = pixel_values.shape[0]
+
+ split_batch_size = args.split_vae_encode if args.split_vae_encode is not None else batch_size
+ num_splits = math.ceil(batch_size / split_batch_size)
+ image_tokens = []
+ for i in range(num_splits):
+ start_idx = i * split_batch_size
+ end_idx = min((i + 1) * split_batch_size, batch_size)
+ bs = pixel_values.shape[0]
+ image_tokens.append(
+ vq_model.quantize(vq_model.encode(pixel_values[start_idx:end_idx]).latents)[2][2].reshape(
+ end_idx - start_idx, -1  # the last split may be smaller than split_batch_size
+ )
+ )
+ image_tokens = torch.cat(image_tokens, dim=0)
+
+ batch_size, seq_len = image_tokens.shape
+
+ timesteps = torch.rand(batch_size, device=image_tokens.device)
+ mask_prob = torch.cos(timesteps * math.pi * 0.5)
+ mask_prob = mask_prob.clip(args.min_masking_rate)
+
+ num_token_masked = (seq_len * mask_prob).round().clamp(min=1)
+ batch_randperm = torch.rand(batch_size, seq_len, device=image_tokens.device).argsort(dim=-1)
+ mask = batch_randperm < num_token_masked.unsqueeze(-1)
+
+ mask_id = accelerator.unwrap_model(model).config.vocab_size - 1
+ input_ids = torch.where(mask, mask_id, image_tokens)
+ labels = torch.where(mask, image_tokens, -100)
+
+ if "prompt_input_ids" in batch:
+ with nullcontext() if args.train_text_encoder else torch.no_grad():
+ if args.text_encoder_architecture == "CLIP_T5_base": # Not supported yet; only open_clip is supported
+ batch["prompt_input_ids"][0] = batch["prompt_input_ids"][0].to(accelerator.device, non_blocking=True)
+ batch["prompt_input_ids"][1] = batch["prompt_input_ids"][1].to(accelerator.device, non_blocking=True)
+ encoder_hidden_states, cond_embeds = encode_prompt(
+ text_encoder, batch["prompt_input_ids"], args.text_encoder_architecture
+ )
+ else:
+ encoder_hidden_states, cond_embeds = encode_prompt(
+ text_encoder, batch["prompt_input_ids"].to(accelerator.device, non_blocking=True), args.text_encoder_architecture
+ )
+
+ if args.cond_dropout_prob > 0.0:
+ assert encoder_hidden_states is not None
+
+ batch_size = encoder_hidden_states.shape[0]
+
+ mask = (
+ torch.zeros((batch_size, 1, 1), device=encoder_hidden_states.device).float().uniform_(0, 1)
+ < args.cond_dropout_prob
+ )
+
+ empty_embeds_ = empty_embeds.expand(batch_size, -1, -1)
+ encoder_hidden_states = torch.where(
+ (encoder_hidden_states * mask).bool(), encoder_hidden_states, empty_embeds_
+ )
+
+ empty_clip_embeds_ = empty_clip_embeds.expand(batch_size, -1)
+ cond_embeds = torch.where((cond_embeds * mask.squeeze(-1)).bool(), cond_embeds, empty_clip_embeds_)
+
+ bs = input_ids.shape[0]
+ vae_scale_factor = 2 ** (len(vq_model.config.block_out_channels) - 1)
+ resolution = args.resolution // vae_scale_factor
+ input_ids = input_ids.reshape(bs, resolution, resolution)
+
+ if "prompt_input_ids" in batch:  # NOTE: this re-encodes the prompt and overwrites any conditioning dropout applied above
+ with nullcontext() if args.train_text_encoder else torch.no_grad():
+ if args.text_encoder_architecture == "CLIP_T5_base": # Not supported yet; only open_clip is supported
+ batch["prompt_input_ids"][0] = batch["prompt_input_ids"][0].to(accelerator.device, non_blocking=True)
+ batch["prompt_input_ids"][1] = batch["prompt_input_ids"][1].to(accelerator.device, non_blocking=True)
+ encoder_hidden_states, cond_embeds = encode_prompt(
+ text_encoder, batch["prompt_input_ids"],args.text_encoder_architecture
+ )
+ else:
+ encoder_hidden_states, cond_embeds = encode_prompt(
+ text_encoder, batch["prompt_input_ids"].to(accelerator.device, non_blocking=True),args.text_encoder_architecture
+ )
+
+ # Train Step
+ with accelerator.accumulate(model):
+ codebook_size = accelerator.unwrap_model(model).config.codebook_size
+
+ if args.pretrained_model_architecture == 'Meissonic':
+
+ if args.resolution == 1024: # only stage 3 and stage 4 do not apply 2*
+ img_ids = _prepare_latent_image_ids(input_ids.shape[0], input_ids.shape[-2],input_ids.shape[-1],input_ids.device,input_ids.dtype)
+ else:
+ img_ids = _prepare_latent_image_ids(input_ids.shape[0],2*input_ids.shape[-2],2*input_ids.shape[-1],input_ids.device,input_ids.dtype)
+
+ txt_ids = torch.zeros(encoder_hidden_states.shape[1],3).to(device = input_ids.device, dtype = input_ids.dtype)
+
+ logits = (
+ model(
+ hidden_states=input_ids, # token ids, shape (batch_size, height, width)
+ encoder_hidden_states=encoder_hidden_states, # should be (batch_size, sequence_len, embed_dims)
+ micro_conds=micro_conds,
+ pooled_projections=cond_embeds, # should be (batch_size, projection_dim)
+ img_ids = img_ids,
+ txt_ids = txt_ids,
+ # timestep = timesteps * 20,
+ timestep = mask_prob * 1000,
+ # guidance = 9,
+ )
+ .reshape(bs, codebook_size, -1)
+ .permute(0, 2, 1)
+ .reshape(-1, codebook_size)
+ )
+ else:
+ raise ValueError(f"Unknown model architecture: {args.pretrained_model_architecture}")
+
+ loss = F.cross_entropy(
+ logits,
+ labels.view(-1),
+ ignore_index=-100,
+ reduction="mean",
+ )
+
+ # Gather the losses across all processes for logging (if we use distributed training).
+ avg_loss = accelerator.gather(loss.repeat(args.train_batch_size)).mean()
+ avg_masking_rate = accelerator.gather(mask_prob.repeat(args.train_batch_size)).mean()
+
+ accelerator.backward(loss)
+
+ if args.max_grad_norm is not None and accelerator.sync_gradients:
+ accelerator.clip_grad_norm_(model.parameters(), args.max_grad_norm)
+
+ optimizer.step()
+ lr_scheduler.step()
+
+ optimizer.zero_grad(set_to_none=True)
+
+ # Checks if the accelerator has performed an optimization step behind the scenes
+ if accelerator.sync_gradients:
+ if args.use_ema:
+ ema.step(model.parameters())
+
+ if (global_step + 1) % args.logging_steps == 0:
+ logs = {
+ "step_loss": avg_loss.item(),
+ "lr": lr_scheduler.get_last_lr()[0],
+ "avg_masking_rate": avg_masking_rate.item(),
+ }
+ accelerator.log(logs, step=global_step + 1)
+
+ logger.info(
+ f"Step: {global_step + 1} "
+ f"Loss: {avg_loss.item():0.4f} "
+ f"LR: {lr_scheduler.get_last_lr()[0]:0.6f}"
+ )
+
+ if (global_step + 1) % args.checkpointing_steps == 0:
+ save_checkpoint(args, accelerator, global_step + 1, logger)
+
+ if (global_step + 1) % args.validation_steps == 0 and accelerator.is_main_process:
+ if args.use_ema:
+ ema.store(model.parameters())
+ ema.copy_to(model.parameters())
+
+ with torch.no_grad():
+ logger.info("Generating images...")
+
+ model.eval()
+
+ if args.train_text_encoder:
+ text_encoder.eval()
+
+ scheduler = Scheduler.from_pretrained(
+ args.pretrained_model_name_or_path,
+ subfolder="scheduler",
+ revision=args.revision,
+ variant=args.variant,
+ )
+ if args.text_encoder_architecture == "CLIP" or args.text_encoder_architecture == "open_clip":
+ pipe = Pipeline(
+ transformer=accelerator.unwrap_model(model),
+ tokenizer=tokenizer,
+ text_encoder=text_encoder,
+ vqvae=vq_model,
+ scheduler=scheduler,
+ )
+ else:
+ pipe = Pipeline(
+ transformer=accelerator.unwrap_model(model),
+ tokenizer=tokenizer[0],
+ text_encoder=text_encoder[0],
+ vqvae=vq_model,
+ scheduler=scheduler,
+ text_encoder_t5=text_encoder[1],
+ tokenizer_t5=tokenizer[1]
+ )
+
+
+
+
+
+ pil_images = pipe(prompt=args.validation_prompts,height=args.resolution,width=args.resolution,guidance_scale=9,num_inference_steps=64).images
+ wandb_images = [
+ wandb.Image(image, caption=args.validation_prompts[i])
+ for i, image in enumerate(pil_images)
+ ]
+
+ wandb.log({"generated_images": wandb_images}, step=global_step + 1)
+
+ result=[]
+ for img in pil_images:
+ if not isinstance(img, torch.Tensor):
+ img = transforms.ToTensor()(img)
+ result.append(img.unsqueeze(0))
+ result = torch.cat(result,dim=0)
+ result = make_grid(result, nrow=3)
+ save_image(result,os.path.join(args.output_dir,str(global_step)+'_text2image_1024_CFG-9.png'))
+
+
+ # pil_images = pipe(prompt=args.validation_prompts,height=args.resolution,width=args.resolution,guidance_scale=9).images
+ # result=[]
+ # for img in pil_images:
+ # if not isinstance(img, torch.Tensor):
+ # img = transforms.ToTensor()(img)
+ # result.append(img.unsqueeze(0))
+ # result = torch.cat(result,dim=0)
+ # result = make_grid(result, nrow=3)
+ # save_image(result,os.path.join(args.output_dir,str(global_step)+'_text2image_1024_CFG-9.png'))
+
+
+
+ model.train()
+
+ if args.train_text_encoder:
+ if args.text_encoder_architecture == "CLIP_T5_base": # Not supported yet; only open_clip is supported
+ text_encoder[0].train()
+ text_encoder[1].train()
+ else:
+ text_encoder.train()
+
+ if args.use_ema:
+ ema.restore(model.parameters())
+
+ global_step += 1
+
+ # Stop training if max steps is reached
+ if global_step >= args.max_train_steps:
+ break
+ # End for
+
+ accelerator.wait_for_everyone()
+
+ # Evaluate and save checkpoint at the end of training
+ save_checkpoint(args, accelerator, global_step, logger)
+
+ # Save the final trained checkpoint
+ if accelerator.is_main_process:
+ model = accelerator.unwrap_model(model)
+ if args.use_ema:
+ ema.copy_to(model.parameters())
+ model.save_pretrained(args.output_dir)
+
+ accelerator.end_training()
+
+
+
+
+
+if __name__ == "__main__":
+ main(parse_args())
+
+
diff --git a/Meissonic/train/train_overfit.py b/Meissonic/train/train_overfit.py
new file mode 100644
index 0000000000000000000000000000000000000000..386a82b4cefcf1e5c1742d44cba6efd3e5b444b9
--- /dev/null
+++ b/Meissonic/train/train_overfit.py
@@ -0,0 +1,624 @@
+#!/usr/bin/env python3
+"""
+Overfitting experiment script to verify implementation correctness.
+
+This script trains on a tiny subset (128-256 videos) with:
+- High learning rate (5e-4 to 1e-3)
+- Simple constant/warmup scheduler
+- Small batch size (4-8)
+- Fixed seed for reproducibility
+- 2k-5k steps
+
+Expected behavior if implementation is correct:
+- Loss should drop to 5-6 or even lower (0.x)
+- Loss should continue decreasing, indicating ability to overfit
+
+If loss stays high (9-10) or diverges, there's likely a bug in:
+- mask_token logic
+- scheduler
+- label alignment
+- logits reshaping
+"""
+
+import argparse
+import logging
+import os
+import sys
+sys.path.append(os.getcwd())
+
+import torch
+import torch.nn.functional as F
+from accelerate import Accelerator
+from accelerate.logging import get_logger
+from accelerate.utils import set_seed
+from torch.utils.data import DataLoader, default_collate
+from transformers import T5Tokenizer, T5EncoderModel
+
+from src.scheduler_video import Scheduler
+from src.pipeline_video import CosmosVideoTokenizer, Pipeline as VideoPipeline
+from src.transformer_video import WanDiscreteVideoTransformer, WanModel
+from train.dataset_utils import TinyOpenVid1MDataset, tokenize_prompt, encode_prompt
+from torchvision import transforms
+from torchvision.utils import save_image, make_grid
+
+logger = get_logger(__name__, log_level="INFO")
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(description="Overfitting experiment for video diffusion model")
+
+ # Model args
+ parser.add_argument("--text_encoder_architecture", type=str, default="umt5-base")
+ parser.add_argument("--video_tokenizer_model_id", type=str, default="Cosmos-1.0-Tokenizer-DV8x16x16")
+ parser.add_argument("--wan_pretrained_path", type=str, default=None, help="Path to pretrained Wan weights")
+
+ # Dataset args
+ parser.add_argument("--instance_data_dir", type=str, required=True, help="Path to OpenVid1M CSV file")
+ parser.add_argument("--max_samples", type=int, default=256, help="Number of samples for overfitting (128-256)")
+ parser.add_argument("--num_frames", type=int, default=16)
+ parser.add_argument("--video_height", type=int, default=480)
+ parser.add_argument("--video_width", type=int, default=848)
+
+ # Training args
+ parser.add_argument("--train_batch_size", type=int, default=4, help="Batch size (4-8 for overfitting)")
+ parser.add_argument("--learning_rate", type=float, default=5e-4, help="High LR for overfitting (5e-4 to 1e-3)")
+ parser.add_argument("--max_train_steps", type=int, default=3000, help="Steps for overfitting (2k-5k)")
+ parser.add_argument("--gradient_accumulation_steps", type=int, default=1)
+ parser.add_argument("--lr_warmup_steps", type=int, default=100, help="Small warmup for overfitting")
+ parser.add_argument("--gradient_checkpointing", action="store_true")
+ parser.add_argument("--mixed_precision", type=str, default="bf16", choices=["no", "fp16", "bf16"])
+
+ # Other args
+ parser.add_argument("--seed", type=int, default=42, help="Fixed seed for reproducibility")
+ parser.add_argument("--output_dir", type=str, default="./output_overfit")
+ parser.add_argument("--logging_steps", type=int, default=50)
+ parser.add_argument("--save_steps", type=int, default=500)
+ parser.add_argument("--inference_steps", type=int, default=500, help="Steps interval for inference (default: 500)")
+ parser.add_argument("--num_inference_samples", type=int, default=4, help="Number of prompts to use for inference")
+ parser.add_argument("--num_inference_steps", type=int, default=48, help="Number of inference steps for generation")
+ parser.add_argument("--dataloader_num_workers", type=int, default=4)
+
+ return parser.parse_args()
+
+
+def save_video_frames(video, output_path, prompt_text=""):
+ """
+ Save video frames as a grid image and individual frames.
+
+ Args:
+ video: Can be list of PIL Images or torch.Tensor [C, F, H, W]
+ output_path: Base path for saving (without extension)
+ prompt_text: Prompt text (currently unused by this function)
+ """
+ import numpy as np
+ from PIL import Image
+
+ # Convert to list of PIL Images if needed
+ if isinstance(video, torch.Tensor):
+ # video: [C, F, H, W] in [0, 1]
+ _, num_video_frames, _, _ = video.shape
+ frames = []
+ for f in range(num_video_frames):
+ frame = video[:, f, :, :].float().clamp(0, 1).cpu().numpy() # [C, H, W]; cast first because NumPy has no bfloat16
+ frame = np.transpose(frame, (1, 2, 0)) # [H, W, C]
+ frame = (frame * 255).astype(np.uint8)
+ frames.append(Image.fromarray(frame))
+ video = frames
+ elif isinstance(video, list):
+ frames = video
+ else:
+ logger.warning(f"Unknown video type: {type(video)}")
+ return
+
+ if not frames:
+ logger.warning(f"No frames to save for {output_path}")
+ return
+
+ # Save grid of all frames
+ frames_tensor = torch.stack([transforms.ToTensor()(frame) for frame in frames], dim=0)
+ grid = make_grid(frames_tensor, nrow=min(4, len(frames)))
+ grid_path = f"{output_path}_grid.png"
+ save_image(grid, grid_path)
+ logger.info(f"Saved video grid to {grid_path}")
+
+ # Save individual frames
+ frames_dir = f"{output_path}_frames"
+ os.makedirs(frames_dir, exist_ok=True)
+ for i, frame in enumerate(frames):
+ frame_path = os.path.join(frames_dir, f"frame_{i:03d}.png")
+ frame.save(frame_path)
+
+ # Save as GIF
+ try:
+ gif_path = f"{output_path}.gif"
+ frames[0].save(
+ gif_path,
+ save_all=True,
+ append_images=frames[1:],
+ duration=200, # 200ms per frame
+ loop=0
+ )
+ logger.info(f"Saved video GIF to {gif_path}")
+ except Exception as e:
+ logger.warning(f"Failed to save GIF: {e}")
+
+
+def main():
+ args = parse_args()
+
+ # Set seed for reproducibility
+ set_seed(args.seed)
+
+ # Initialize accelerator
+ accelerator = Accelerator(
+ gradient_accumulation_steps=args.gradient_accumulation_steps,
+ mixed_precision=args.mixed_precision,
+ log_with=None, # Disable wandb for overfitting experiment
+ )
+
+ # Setup logging
+ logging.basicConfig(
+ format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
+ datefmt="%m/%d/%Y %H:%M:%S",
+ level=logging.INFO,
+ )
+ logger.info(accelerator.state, main_process_only=False)
+
+ if accelerator.is_local_main_process:
+ os.makedirs(args.output_dir, exist_ok=True)
+
+ # Initialize tokenizer and text encoder
+ logger.info("Initializing text encoder...")
+ if args.text_encoder_architecture in ["umt5-base", "t5"]:
+ tokenizer = T5Tokenizer.from_pretrained("google/umt5-base")
+ text_encoder = T5EncoderModel.from_pretrained("google/umt5-base")
+ else:
+ raise ValueError(f"Unsupported text encoder: {args.text_encoder_architecture}")
+
+ text_encoder.requires_grad_(False)
+ text_encoder.eval()
+ text_dim_actual = text_encoder.config.d_model
+
+ # Initialize video tokenizer
+ logger.info("Initializing video tokenizer...")
+ device = accelerator.device
+ dtype = torch.bfloat16 if args.mixed_precision == "bf16" else (torch.float16 if args.mixed_precision == "fp16" else torch.float32)
+ video_tokenizer = CosmosVideoTokenizer(
+ model_id=args.video_tokenizer_model_id,
+ device=device,
+ dtype=dtype,
+ )
+ video_tokenizer.requires_grad_(False)
+
+ # Calculate compressed dimensions
+ t_ds = video_tokenizer.t_downsample
+ h_ds = video_tokenizer.h_downsample
+ w_ds = video_tokenizer.w_downsample
+ F_prime = args.num_frames // t_ds
+ H_prime = args.video_height // h_ds
+ W_prime = args.video_width // w_ds
+
+ # Initialize transformer model
+ logger.info("Initializing transformer model...")
+
+ # Try to load Wan config from pretrained weights if provided
+ wan_config = None
+ if args.wan_pretrained_path:
+ try:
+ if os.path.isdir(args.wan_pretrained_path):
+ # Local directory
+ config_path = os.path.join(args.wan_pretrained_path, "config.json")
+ if os.path.exists(config_path):
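+ import json
+ from types import SimpleNamespace
+ with open(config_path, 'r') as f:
+ # Wrap the parsed dict so the attribute access below (wan_config.dim, ...) works
+ wan_config = SimpleNamespace(**json.load(f))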
+ else:
+ # HuggingFace Hub - try to load config
+ try:
+ from types import SimpleNamespace
+ temp_model = WanModel.from_pretrained(args.wan_pretrained_path, subfolder=None)
+ wan_config = SimpleNamespace(
+ dim=temp_model.dim,
+ ffn_dim=temp_model.ffn_dim,
+ num_layers=temp_model.num_layers,
+ num_heads=temp_model.num_heads,
+ freq_dim=temp_model.freq_dim,
+ in_dim=temp_model.in_dim,
+ out_dim=temp_model.out_dim,
+ text_dim=getattr(temp_model, 'text_dim', None),
+ )
+ del temp_model
+ except Exception as e:
+ logger.warning(f"Could not read Wan config from {args.wan_pretrained_path}: {e}")
+ except Exception as e:
+ logger.warning(f"Failed to load Wan config: {e}")
+
+ # Use Wan config if available, otherwise use defaults
+ if wan_config:
+ dim = wan_config.dim
+ ffn_dim = wan_config.ffn_dim
+ num_layers = wan_config.num_layers
+ num_heads = wan_config.num_heads
+ freq_dim = wan_config.freq_dim
+ in_dim = wan_config.in_dim
+ out_dim = wan_config.out_dim
+ text_dim_for_model = getattr(wan_config, "text_dim", None) or text_dim_actual
+ else:
+ # Default values
+ dim = 2048
+ ffn_dim = 8192
+ num_layers = 32
+ num_heads = 16
+ freq_dim = 256
+ in_dim = 16
+ out_dim = 16
+ text_dim_for_model = text_dim_actual
+
+ # Override text_dim with actual text encoder dimension
+ if text_dim_for_model != text_dim_actual:
+ logger.warning(f"Wan config text_dim ({text_dim_for_model}) != text encoder dim ({text_dim_actual}), using {text_dim_actual}")
+ text_dim_for_model = text_dim_actual
+
+ model = WanDiscreteVideoTransformer(
+ codebook_size=video_tokenizer.codebook_size,
+ vocab_size=video_tokenizer.codebook_size + 1,
+ num_frames=F_prime,
+ height=H_prime,
+ width=W_prime,
+ text_dim=text_dim_for_model,
+ dim=dim,
+ ffn_dim=ffn_dim,
+ num_layers=num_layers,
+ num_heads=num_heads,
+ freq_dim=freq_dim,
+ in_dim=in_dim,
+ out_dim=out_dim,
+ )
+
+ # Load pretrained weights if provided
+ if args.wan_pretrained_path:
+ logger.info(f"Loading pretrained weights from {args.wan_pretrained_path}...")
+ try:
+ if os.path.isdir(args.wan_pretrained_path):
+ state_dict_path = os.path.join(args.wan_pretrained_path, "diffusion_pytorch_model.safetensors")
+ if not os.path.exists(state_dict_path):
+ state_dict_path = os.path.join(args.wan_pretrained_path, "pytorch_model.bin")
+
+ if os.path.exists(state_dict_path):
+ from safetensors import safe_open
+ wan_state_dict = {}
+ if state_dict_path.endswith('.safetensors'):
+ with safe_open(state_dict_path, framework="pt", device="cpu") as f:
+ for k in f.keys():
+ wan_state_dict[k] = f.get_tensor(k)
+ else:
+ wan_state_dict = torch.load(state_dict_path, map_location="cpu")
+ else:
+ raise FileNotFoundError(f"State dict not found in {args.wan_pretrained_path}")
+ else:
+ # HuggingFace Hub
+ temp_model = WanModel.from_pretrained(args.wan_pretrained_path, subfolder=None)
+ wan_state_dict = temp_model.state_dict()
+ del temp_model
+
+ # Remove text_embedding weights if shape doesn't match
+ text_embedding_key = 'text_embedding.0.weight'
+ if text_embedding_key in wan_state_dict:
+ pretrained_text_dim = wan_state_dict[text_embedding_key].shape[1]
+ model_text_dim = model.backbone.text_embedding[0].weight.shape[1]
+
+ if pretrained_text_dim != model_text_dim:
+ keys_to_remove = [k for k in wan_state_dict.keys() if 'text_embedding' in k]
+ for k in keys_to_remove:
+ del wan_state_dict[k]
+ logger.info(f"Removed {len(keys_to_remove)} text_embedding keys due to dimension mismatch")
+
+ missing_keys, unexpected_keys = model.backbone.load_state_dict(wan_state_dict, strict=False)
+ if missing_keys:
+ logger.warning(f"Missing keys: {missing_keys[:10]}...")
+ if unexpected_keys:
+ logger.warning(f"Unexpected keys: {unexpected_keys[:10]}...")
+ logger.info("Successfully loaded pretrained weights")
+ except Exception as e:
+ logger.warning(f"Failed to load pretrained weights: {e}")
+
+ # Initialize scheduler
+ logger.info("Initializing scheduler...")
+ scheduler = Scheduler(
+ mask_token_id=video_tokenizer.mask_token_id,
+ masking_schedule="cosine",
+ )
+
+ # Setup optimizer
+ logger.info("Setting up optimizer...")
+ optimizer = torch.optim.AdamW(
+ model.parameters(),
+ lr=args.learning_rate,
+ betas=(0.9, 0.999),
+ weight_decay=0.01,
+ eps=1e-8,
+ )
+
+ # Simple constant scheduler with warmup
+ def lr_lambda(current_step):
+ if current_step < args.lr_warmup_steps:
+ return float(current_step) / float(max(1, args.lr_warmup_steps))
+ return 1.0
+
+ lr_scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
+
+ # Create tiny dataset
+ logger.info(f"Creating tiny dataset with {args.max_samples} samples...")
+
+ # Auto-detect video_root_dir relative to the CSV file location
+ csv_path = args.instance_data_dir
+ csv_dir = os.path.dirname(csv_path)
+ if os.path.exists(os.path.join(csv_dir, 'video_reorg')):
+ video_root_dir = os.path.join(csv_dir, 'video_reorg')
+ elif os.path.exists(os.path.join(os.path.dirname(csv_dir), 'video_reorg')):
+ video_root_dir = os.path.join(os.path.dirname(csv_dir), 'video_reorg')
+ else:
+ # Fallback: use CSV directory
+ video_root_dir = csv_dir
+ logger.warning(f"Video directory not found, using CSV directory: {video_root_dir}")
+
+ dataset = TinyOpenVid1MDataset(
+ csv_path=csv_path,
+ video_root_dir=video_root_dir,
+ tokenizer=tokenizer,
+ num_frames=args.num_frames,
+ height=args.video_height,
+ width=args.video_width,
+ text_encoder_architecture=args.text_encoder_architecture,
+ max_samples=args.max_samples,
+ seed=args.seed,
+ )
+
+ # Create dataloader
+ train_dataloader = DataLoader(
+ dataset,
+ batch_size=args.train_batch_size,
+ shuffle=True,
+ num_workers=args.dataloader_num_workers,
+ collate_fn=default_collate,
+ pin_memory=True,
+ prefetch_factor=2 if args.dataloader_num_workers > 0 else None,
+ persistent_workers=args.dataloader_num_workers > 0,
+ )
+
+ logger.info(f"Dataset size: {len(dataset)}")
+ logger.info(f"Dataloader batches: {len(train_dataloader)}")
+
+ # Enable gradient checkpointing if requested (before prepare)
+ if args.gradient_checkpointing:
+ model.enable_gradient_checkpointing()
+
+ # Prepare with accelerator
+ model, optimizer, lr_scheduler, train_dataloader, text_encoder = accelerator.prepare(
+ model, optimizer, lr_scheduler, train_dataloader, text_encoder
+ )
+
+ # Training loop
+ logger.info("Starting overfitting experiment...")
+ logger.info(f"Target: Loss should drop to 5-6 or lower within {args.max_train_steps} steps")
+ logger.info("If loss stays high (9-10) or diverges, there's likely a bug")
+
+ model.train()
+ global_step = 0
+
+ for epoch in range(1000): # Large number, will break on step limit
+ for batch in train_dataloader:
+ with accelerator.accumulate(model):
+ # Get video and encode
+ video_values = batch["video"].to(device=accelerator.device, dtype=torch.float32)
+ video_tokens = video_tokenizer.encode(video_values) # [B, F', H', W']
+
+ # Flatten for masking
+ B, F_prime_vid, H_prime_vid, W_prime_vid = video_tokens.shape
+ video_tokens_flat = video_tokens.view(B, -1)
+ seq_len = video_tokens_flat.shape[1]
+
+ # Apply masking: use per-sample masking like train_mei_video.py
+ mask_prob = torch.rand(B, device=video_tokens_flat.device) * 0.5 + 0.1 # 0.1 to 0.6 per sample
+ num_token_masked = (seq_len * mask_prob).round().clamp(min=1)
+ batch_randperm = torch.rand(B, seq_len, device=video_tokens_flat.device).argsort(dim=-1)
+ mask = batch_randperm < num_token_masked.unsqueeze(-1)
+
+ # Create input_ids: masked positions get mask_token_id, others keep original tokens
+ mask_id = video_tokenizer.mask_token_id # codebook_size
+ input_ids_flat = torch.where(mask, mask_id, video_tokens_flat)
+ # Create labels: masked positions get original tokens (for loss), others get -100 (ignored)
+ labels_flat = torch.where(mask, video_tokens_flat, -100)
+
+ # Reshape back to [B, F', H', W'] for model forward
+ input_ids = input_ids_flat.view(B, F_prime_vid, H_prime_vid, W_prime_vid)
+ labels = labels_flat.view(B, F_prime_vid, H_prime_vid, W_prime_vid)
+
+ # Use average mask ratio for timestep
+ mask_ratio = mask_prob.mean().item()
+
+ # Encode text
+ encoder_hidden_states, cond_embeds = encode_prompt(
+ text_encoder,
+ batch["prompt_input_ids"].to(device=accelerator.device),
+ args.text_encoder_architecture
+ )
+
+ # Forward pass
+ logits = model(
+ tokens=input_ids,
+ timesteps=torch.full((B,), int(mask_ratio * 1000), device=accelerator.device, dtype=torch.long),
+ encoder_hidden_states=encoder_hidden_states,
+ y=None,
+ )
+
+ # Reshape for loss
+ # logits: [B, vocab_size, F', H', W'] -> [B*F'*H'*W', vocab_size]
+ B_logits, vocab_size, F_prime_logits, H_prime_logits, W_prime_logits = logits.shape
+ logits = logits.permute(0, 2, 3, 4, 1).reshape(B_logits * F_prime_logits * H_prime_logits * W_prime_logits, vocab_size)
+
+ # labels: [B, F', H', W'] - crop to match logits dimensions if needed
+ B_labels, F_prime_labels, H_prime_labels, W_prime_labels = labels.shape
+ assert B_logits == B_labels, f"Batch size mismatch: logits {B_logits} vs labels {B_labels}"
+
+ # Crop labels to match logits spatial dimensions
+ if F_prime_labels != F_prime_logits or H_prime_labels != H_prime_logits or W_prime_labels != W_prime_logits:
+ labels = labels[:, :F_prime_logits, :H_prime_logits, :W_prime_logits]
+
+ # labels: [B, F', H', W'] -> [B*F'*H'*W']
+ labels_flat = labels.reshape(-1)
+
+ # Verify label values are in valid range [0, codebook_size-1] or -100 (ignored)
+ codebook_size = video_tokenizer.codebook_size
+ valid_labels = labels_flat[labels_flat != -100]
+ if len(valid_labels) > 0:
+ assert valid_labels.min() >= 0 and valid_labels.max() < codebook_size, (
+ f"Label values out of range: min={valid_labels.min()}, max={valid_labels.max()}, "
+ f"expected [0, {codebook_size-1}]"
+ )
+
+ # Compute loss: only on masked positions (labels != -100), ignore unmasked positions
+ # vocab_size = codebook_size + 1 (includes mask_token_id = codebook_size)
+ # labels are in [0, codebook_size-1] range (Cosmos tokens), which map directly to logits indices [0, codebook_size-1]
+ loss = F.cross_entropy(
+ logits,
+ labels_flat,
+ ignore_index=-100, # Ignore unmasked positions
+ reduction="mean",
+ )
+
+ # Backward
+ accelerator.backward(loss)
+ if accelerator.sync_gradients:
+ accelerator.clip_grad_norm_(model.parameters(), 1.0)
+ optimizer.step()
+ lr_scheduler.step()
+ optimizer.zero_grad()
+
+ if accelerator.sync_gradients:
+ global_step += 1
+
+ if global_step % args.logging_steps == 0:
+ logger.info(f"Step {global_step}/{args.max_train_steps}, Loss: {loss.item():.4f}, LR: {lr_scheduler.get_last_lr()[0]:.2e}")
+
+ if global_step % args.save_steps == 0:
+ if accelerator.is_main_process:
+ checkpoint_path = os.path.join(args.output_dir, f"checkpoint-{global_step}")
+ os.makedirs(checkpoint_path, exist_ok=True)
+ unwrapped_model = accelerator.unwrap_model(model)
+ unwrapped_model.save_pretrained(checkpoint_path)
+ logger.info(f"Saved checkpoint to {checkpoint_path}")
+
+ # Inference: generate videos using training data prompts
+ if global_step % args.inference_steps == 0 and global_step > 0:
+ if accelerator.is_main_process:
+ logger.info(f"Step {global_step}: Generating videos for inference...")
+
+ # Sample prompts from training dataset (get original captions from dataset.data)
+ inference_indices = torch.randperm(len(dataset), generator=torch.Generator().manual_seed(args.seed))[:args.num_inference_samples].tolist()
+ inference_prompts = []
+ for idx in inference_indices:
+ # Get original caption from dataset
+ row = dataset.data[idx]
+ prompt_text = row['caption']
+ if dataset.prompt_prefix is not None:
+ prompt_text = dataset.prompt_prefix + prompt_text
+ inference_prompts.append(prompt_text)
+
+ logger.info(f"Using prompts: {inference_prompts[:2]}...") # Log first 2 prompts
+
+ try:
+ # Create inference pipeline
+ model.eval()
+ text_encoder.eval()
+
+ # Get unwrapped models and cast to the inference dtype (note: Module.to() casts in place, so the training weights are cast too)
+ unwrapped_model = accelerator.unwrap_model(model)
+ unwrapped_text_encoder = accelerator.unwrap_model(text_encoder)
+ weight_dtype = torch.bfloat16 if args.mixed_precision == "bf16" else (torch.float16 if args.mixed_precision == "fp16" else torch.float32)
+ unwrapped_model = unwrapped_model.to(dtype=weight_dtype)
+ unwrapped_text_encoder = unwrapped_text_encoder.to(dtype=weight_dtype)
+
+ # Create scheduler for inference
+ inference_scheduler = Scheduler(
+ mask_token_id=video_tokenizer.mask_token_id,
+ masking_schedule="cosine"
+ )
+ inference_scheduler.set_timesteps(
+ num_inference_steps=args.num_inference_steps,
+ device=accelerator.device
+ )
+
+ # Create pipeline
+ pipe = VideoPipeline(
+ tokenizer=tokenizer,
+ text_encoder=unwrapped_text_encoder,
+ transformer=unwrapped_model,
+ scheduler=inference_scheduler,
+ video_tokenizer=video_tokenizer,
+ text_len=512,
+ num_frames=args.num_frames,
+ height=args.video_height,
+ width=args.video_width,
+ )
+ pipe = pipe.to(accelerator.device)
+
+ # Generate videos
+ with torch.no_grad():
+ videos = pipe(
+ prompt=inference_prompts,
+ num_frames=args.num_frames,
+ height=args.video_height,
+ width=args.video_width,
+ guidance_scale=9.0,
+ num_inference_steps=args.num_inference_steps,
+ output_type="pil",
+ ).videos
+
+ # Save videos
+ inference_dir = os.path.join(args.output_dir, f"inference_step_{global_step}")
+ os.makedirs(inference_dir, exist_ok=True)
+
+ for i, (video, prompt) in enumerate(zip(videos, inference_prompts)):
+ # Sanitize prompt for filename (remove special chars, limit length)
+ safe_prompt = "".join(c for c in prompt[:50] if c.isalnum() or c in (' ', '-', '_')).strip().replace(' ', '_')[:30]
+ if not safe_prompt:
+ safe_prompt = f"prompt_{i}"
+ output_path = os.path.join(inference_dir, f"step{global_step}_video{i}_{safe_prompt}")
+ save_video_frames(video, output_path, prompt_text=prompt)
+
+ logger.info(f"Saved inference videos to {inference_dir}")
+ logger.info("=" * 80)
+ logger.info(f"Step {global_step} Inference Results:")
+ logger.info(f" - Videos saved to: {inference_dir}")
+ logger.info(f" - Prompts used: {inference_prompts}")
+ logger.info(" - Check videos to observe:")
+ logger.info(" * Structure: When do coherent shapes/objects appear?")
+ logger.info(" * Motion: When does temporal consistency emerge?")
+ logger.info(" * Condition alignment: When do videos match prompts?")
+ logger.info("=" * 80)
+
+ # Set model back to training mode
+ model.train()
+
+ except Exception as e:
+ logger.error(f"Inference failed at step {global_step}: {e}")
+ import traceback
+ traceback.print_exc()
+ model.train() # Ensure model is back in training mode
+
+ if global_step >= args.max_train_steps:
+ break
+
+ if global_step >= args.max_train_steps:
+ break
+
+ logger.info("Overfitting experiment completed!")
+ logger.info(f"Final loss: {loss.item():.4f}")
+ logger.info("If loss dropped to 5-6 or lower, implementation is likely correct.")
+ logger.info("If loss stayed high (9-10) or diverged, check for bugs in mask_token, scheduler, or label alignment.")
+
+
+if __name__ == "__main__":
+ main()
+
diff --git a/Meissonic/train/train_video.sh b/Meissonic/train/train_video.sh
new file mode 100644
index 0000000000000000000000000000000000000000..1b022fe8de1780ad1e75eecbb08f6dadb7077113
--- /dev/null
+++ b/Meissonic/train/train_video.sh
@@ -0,0 +1,82 @@
+#!/bin/bash
+# 8-GPU training script for video diffusion model
+# Usage: bash train/train_video.sh
+
+accelerate launch --multi_gpu --gpu_ids '0,1,2,3,4,5,6,7' --main_process_port 25011 --num_processes 8 \
+ train/train_mei_video.py \
+ --use_precomputed_features \
+ --features_dir /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128_with_mask_attn_debug \
+ --text_encoder_architecture umt5-xxl \
+ --wan_pretrained_path /mnt/Wan2.1-T2V-1.3B \
+ --training_from_scratch True \
+ --pretrained_model_name_or_path "dummy" \
+ --wan_backbone_lr_ratio 0.2 \
+ --num_frames 17 \
+ --video_height 128 \
+ --video_width 128 \
+ --dataloader_num_workers 8 \
+ --video_tokenizer_model_id "Cosmos-0.1-Tokenizer-DV4x8x8" \
+ --instance_dataset OpenVid1MDataset \
+ --instance_data_dir "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv" \
+ --train_batch_size 8 \
+ --gradient_accumulation_steps 4 \
+ --learning_rate 3e-4 \
+ --max_train_steps 10000 \
+ --checkpointing_steps 500 \
+ --validation_steps 500 \
+ --logging_steps 10 \
+ --validation_prompts "a cat playing" "a girl walking" \
+ --output_dir "./output_128x128_17f_2*4bs_4*8*8vqvae_0_2_ratio" \
+ --mixed_precision bf16 \
+ --lr_scheduler constant \
+ --lr_warmup_steps 0 \
+ --use_8bit_adam \
+ --gradient_checkpointing \
+ --min_masking_rate 0.0 \
+ --cond_dropout_prob 0.1 \
+ --split_vae_encode 1 \
+ --allow_tf32 \
+ --seed 42 \
+ --report_to wandb
+
+ # --use_precomputed_features \
+ # --features_dir /mnt/VideoGen/dataset/OpenVid1M/extracted_features \
+
+# accelerate launch --multi_gpu --gpu_ids '0,1,2,3,4,5,6,7' --main_process_port 25011 --num_processes 8 \
+# train/train_mei_video.py \
+# --use_precomputed_features \
+# --features_dir /mnt/VideoGen/dataset/OpenVid1M/extracted_features \
+# --text_encoder_architecture umt5-xxl \
+# --wan_pretrained_path Wan-AI/Wan2.1-T2V-1.3B \
+# --training_from_scratch True \
+# --pretrained_model_name_or_path "dummy" \
+# --wan_backbone_lr_ratio 1 \
+# --num_frames 4 \
+# --video_height 256 \
+# --video_width 448 \
+# --dataloader_num_workers 8 \
+# --video_tokenizer_model_id "Cosmos-0.1-Tokenizer-DV4x8x8" \
+# --instance_dataset OpenVid1MDataset \
+# --instance_data_dir "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv" \
+# --train_batch_size 1 \
+# --gradient_accumulation_steps 1 \
+# --learning_rate 3e-4 \
+# --max_train_steps 10000 \
+# --checkpointing_steps 500 \
+# --validation_steps 500 \
+# --logging_steps 10 \
+# --validation_prompts "a cat playing" "a girl walking" \
+# --output_dir "./output_256x448_4f_2bs_4*8*8vqvae_0_00_ratio_continue_tmp" \
+# --mixed_precision bf16 \
+# --lr_scheduler constant \
+# --lr_warmup_steps 0 \
+# --use_8bit_adam \
+# --gradient_checkpointing \
+# --min_masking_rate 0.0 \
+# --cond_dropout_prob 0.1 \
+# --split_vae_encode 1 \
+# --allow_tf32 \
+# --seed 42 \
+# --report_to wandb
+
+# --pretrained_model_name_or_path "/mnt/Meissonic/output_256x448_4f_2bs_4*8*8vqvae_0_00_ratio/checkpoint-4000" \
\ No newline at end of file
diff --git a/Meissonic/train/train_video_stage_1.sh b/Meissonic/train/train_video_stage_1.sh
new file mode 100644
index 0000000000000000000000000000000000000000..d40479f24c275e43d3ccbb2d9ffa05a3915b6e6d
--- /dev/null
+++ b/Meissonic/train/train_video_stage_1.sh
@@ -0,0 +1,83 @@
+
+
+# Overall batch size: 8 (per GPU) * 4 (gradient accumulation) * 96 (GPUs) = 3072
+# We set max steps to 100k, but we may not need that many; it depends on actual performance.
+# Typically 10 steps take 20s, so we can run 1800 steps per hour and 43.2k steps per day, which equals about 132 epochs.
+
+
+export WAN_DISABLE_FLASH_ATTN=1
+
+accelerate launch --multi_gpu --gpu_ids '0,1,2,3,4,5,6,7' --main_process_port 25011 --num_processes 8 \
+ train/train_mei_video.py \
+ --use_precomputed_video_only \
+ --features_dir /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128 \
+ --text_encoder_architecture umt5-xxl \
+ --wan_pretrained_path /mnt/Wan2.1-T2V-1.3B \
+ --training_from_scratch \
+ --pretrained_model_name_or_path "dummy" \
+ --wan_backbone_lr_ratio 0.2 \
+ --num_frames 17 \
+ --video_height 128 \
+ --video_width 128 \
+ --dataloader_num_workers 8 \
+ --video_tokenizer_model_id "Cosmos-0.1-Tokenizer-DV4x8x8" \
+ --instance_dataset OpenVid1MDataset \
+ --instance_data_dir "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv" \
+ --train_batch_size 1 \
+ --gradient_accumulation_steps 1 \
+ --learning_rate 3e-3 \
+ --max_train_steps 100000 \
+ --checkpointing_steps 500 \
+ --validation_steps 100 \
+ --logging_steps 10 \
+ --validation_prompts "a cat playing" "a girl walking" "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution." \
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner." \
+ --output_dir "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3" \
+ --mixed_precision bf16 \
+ --lr_scheduler constant \
+ --lr_warmup_steps 0 \
+ --use_8bit_adam \
+ --gradient_checkpointing \
+ --min_masking_rate 0.0 \
+ --cond_dropout_prob 0.0 \
+ --split_vae_encode 1 \
+ --allow_tf32 \
+ --seed 42 \
+ --report_to wandb
+
+# accelerate launch --multi_gpu --gpu_ids '0,1,2,3,4,5,6,7' --main_process_port 25011 --num_processes 8 \
+# train/train_mei_video.py \
+# --use_precomputed_features \
+# --features_dir /mnt/VideoGen/dataset/OpenVid1M/extracted_features \
+# --text_encoder_architecture umt5-xxl \
+# --wan_pretrained_path /mnt/Wan2.1-T2V-1.3B \
+# --training_from_scratch True \
+# --pretrained_model_name_or_path "dummy" \
+# --wan_backbone_lr_ratio 0.2 \
+# --num_frames 17 \
+# --video_height 128 \
+# --video_width 128 \
+# --dataloader_num_workers 8 \
+# --video_tokenizer_model_id "Cosmos-0.1-Tokenizer-DV4x8x8" \
+# --instance_dataset OpenVid1MDataset \
+# --instance_data_dir "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv" \
+# --train_batch_size 8 \
+# --gradient_accumulation_steps 4 \
+# --learning_rate 3e-4 \
+# --max_train_steps 100000 \
+# --checkpointing_steps 500 \
+# --validation_steps 500 \
+# --logging_steps 10 \
+# --validation_prompts "a cat playing" "a girl walking" \
+# --output_dir "./output_128x128_17f_8*4bs_4*8*8vqvae_0_2_ratio" \
+# --mixed_precision bf16 \
+# --lr_scheduler constant \
+# --lr_warmup_steps 0 \
+# --use_8bit_adam \
+# --gradient_checkpointing \
+# --min_masking_rate 0.0 \
+# --cond_dropout_prob 0.1 \
+# --split_vae_encode 1 \
+# --allow_tf32 \
+# --seed 42 \
+# --report_to wandb
\ No newline at end of file
diff --git a/Meissonic/train/trainer_utils.py b/Meissonic/train/trainer_utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..a4956a83dd94a140cab80c0baedd906713523f82
--- /dev/null
+++ b/Meissonic/train/trainer_utils.py
@@ -0,0 +1,45 @@
+# Copyright 2024 The HuggingFace Team and The MeissonFlow Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import shutil
+from pathlib import Path
+
+
+def save_checkpoint(args, accelerator, global_step, logger):
+ output_dir = args.output_dir
+
+ # _before_ saving state, check if this save would set us over the `checkpoints_total_limit`
+ if accelerator.is_main_process and args.checkpoints_total_limit is not None:
+ checkpoints = os.listdir(output_dir)
+ checkpoints = [d for d in checkpoints if d.startswith("checkpoint-")]
+ checkpoints = sorted(checkpoints, key=lambda x: int(x.split("-")[1]))
+
+ # before we save the new checkpoint, we need to have at _most_ `checkpoints_total_limit - 1` checkpoints
+ if len(checkpoints) >= args.checkpoints_total_limit:
+ num_to_remove = len(checkpoints) - args.checkpoints_total_limit + 1
+ removing_checkpoints = checkpoints[0:num_to_remove]
+
+ logger.info(
+ f"{len(checkpoints)} checkpoints already exist, removing {len(removing_checkpoints)} checkpoints"
+ )
+ logger.info(f"removing checkpoints: {', '.join(removing_checkpoints)}")
+
+ for removing_checkpoint in removing_checkpoints:
+ removing_checkpoint = os.path.join(output_dir, removing_checkpoint)
+ shutil.rmtree(removing_checkpoint)
+
+ save_path = Path(output_dir) / f"checkpoint-{global_step}"
+ accelerator.save_state(save_path)
+ logger.info(f"Saved state to {save_path}")
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/config.yaml b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10100_b2eb0bb48e9b123906cb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10100_b2eb0bb48e9b123906cb.png
new file mode 100644
index 0000000000000000000000000000000000000000..522e6a6bb3e25598d84024cb9e3a3aa7498f39cd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10100_b2eb0bb48e9b123906cb.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10100_c59286dde818ff401b6a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10100_c59286dde818ff401b6a.png
new file mode 100644
index 0000000000000000000000000000000000000000..0a00bb3cf55ed5d1d146067b0654dae1e3f61150
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10100_c59286dde818ff401b6a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10100_d7edfc7bd3a3f1aa23df.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10100_d7edfc7bd3a3f1aa23df.png
new file mode 100644
index 0000000000000000000000000000000000000000..6458fe8e862d6b5b23f7b4456565eed919cda394
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10100_d7edfc7bd3a3f1aa23df.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10100_fdafe375c14500136dd8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10100_fdafe375c14500136dd8.png
new file mode 100644
index 0000000000000000000000000000000000000000..473fe86ccc4ad110ced00a1850e90cf01683dab5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10100_fdafe375c14500136dd8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10300_16c15869973ee1b5228d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10300_16c15869973ee1b5228d.png
new file mode 100644
index 0000000000000000000000000000000000000000..203c170291e865d279b803fda040383c0859647e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10300_16c15869973ee1b5228d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10300_207b2cca76cf64f121a3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10300_207b2cca76cf64f121a3.png
new file mode 100644
index 0000000000000000000000000000000000000000..2ac56a959c2a42a4837bc82c8484d10a0b677db3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10300_207b2cca76cf64f121a3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10300_83d0cf2f70f7ea389d56.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10300_83d0cf2f70f7ea389d56.png
new file mode 100644
index 0000000000000000000000000000000000000000..eb161d0c1f59b07c4f25508793a0b04bee2c77fb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10300_83d0cf2f70f7ea389d56.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10300_c0efab1d11ea6790d325.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10300_c0efab1d11ea6790d325.png
new file mode 100644
index 0000000000000000000000000000000000000000..1fb5b0c68232ea6831bfb2f6c08fc9c33ee38fa9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10300_c0efab1d11ea6790d325.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10500_364ac13c23f2b7067ad8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10500_364ac13c23f2b7067ad8.png
new file mode 100644
index 0000000000000000000000000000000000000000..2e1b2371674579ef700568aaa3242e6e78ca010f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10500_364ac13c23f2b7067ad8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10500_3f0acd11e899c31362f7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10500_3f0acd11e899c31362f7.png
new file mode 100644
index 0000000000000000000000000000000000000000..4dff9b68b23bc895f28b85b6e6a0d9c18b0396e5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10500_3f0acd11e899c31362f7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10500_927ccc485ef19492a8b3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10500_927ccc485ef19492a8b3.png
new file mode 100644
index 0000000000000000000000000000000000000000..71e25b1e1cc8159fa83869635e9978c87e576822
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10500_927ccc485ef19492a8b3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10500_a28cdd5a2c11855e9f4e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10500_a28cdd5a2c11855e9f4e.png
new file mode 100644
index 0000000000000000000000000000000000000000..796104ae0212336ba947a53173322e8838046ac1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10500_a28cdd5a2c11855e9f4e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10700_3d6456980401d6a1c171.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10700_3d6456980401d6a1c171.png
new file mode 100644
index 0000000000000000000000000000000000000000..bde13b6764d53e567a5b6de9bd301b593867c3a8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10700_3d6456980401d6a1c171.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10700_91655412b9269e4f10a4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10700_91655412b9269e4f10a4.png
new file mode 100644
index 0000000000000000000000000000000000000000..26d2f8e9346495fa65e48c64f88a5f9b1ae93590
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10700_91655412b9269e4f10a4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10700_9ee1cce95e3ff32d3157.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10700_9ee1cce95e3ff32d3157.png
new file mode 100644
index 0000000000000000000000000000000000000000..fe94f40dd5d1a2e7838a253fb23a87acb85603de
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10700_9ee1cce95e3ff32d3157.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10700_c1f68b8d088a043612e1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10700_c1f68b8d088a043612e1.png
new file mode 100644
index 0000000000000000000000000000000000000000..525604c7d98c697189b640758b9f0c8f10550553
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10700_c1f68b8d088a043612e1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10900_029ef7e804ed3f057ac6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10900_029ef7e804ed3f057ac6.png
new file mode 100644
index 0000000000000000000000000000000000000000..ca0dd18e88c142bdd760e2a58c0e2283f1c0d884
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10900_029ef7e804ed3f057ac6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10900_7c52045b4c85a743dbde.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10900_7c52045b4c85a743dbde.png
new file mode 100644
index 0000000000000000000000000000000000000000..5e1fd20205c962771b3820882510d2102c358a37
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10900_7c52045b4c85a743dbde.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10900_bb2565322c91043513d1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10900_bb2565322c91043513d1.png
new file mode 100644
index 0000000000000000000000000000000000000000..745ca28c0fc5606b922877e262474a6fa3406238
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10900_bb2565322c91043513d1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10900_d2ecafecec366bc5470d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10900_d2ecafecec366bc5470d.png
new file mode 100644
index 0000000000000000000000000000000000000000..f5e5ae82b552f0dbe4703c25caa36b8264dffcf2
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_10900_d2ecafecec366bc5470d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1100_063358310384c39f10c6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1100_063358310384c39f10c6.png
new file mode 100644
index 0000000000000000000000000000000000000000..f4c052ffab6149d398742c53f861230e86252e72
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1100_063358310384c39f10c6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1100_5b444afe385143fc89ea.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1100_5b444afe385143fc89ea.png
new file mode 100644
index 0000000000000000000000000000000000000000..b3fd0a97be38b1a723b94894e7831abc23be2aa9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1100_5b444afe385143fc89ea.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1100_606b8ff3dbcef70a2a5f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1100_606b8ff3dbcef70a2a5f.png
new file mode 100644
index 0000000000000000000000000000000000000000..2416a5649f68a87458394f1ad33c0e97d772b9d8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1100_606b8ff3dbcef70a2a5f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1100_efad9c818ba1bf73c1ae.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1100_efad9c818ba1bf73c1ae.png
new file mode 100644
index 0000000000000000000000000000000000000000..85996d7e86bd0fdaaea9c2cb7a369be6d0b78a09
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1100_efad9c818ba1bf73c1ae.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11100_0dddfc1e019141b31b3c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11100_0dddfc1e019141b31b3c.png
new file mode 100644
index 0000000000000000000000000000000000000000..74c1b8c3ec48242389b0dc0b68a64ffcb4ee62fc
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11100_0dddfc1e019141b31b3c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11100_2b14808f903801abb8a5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11100_2b14808f903801abb8a5.png
new file mode 100644
index 0000000000000000000000000000000000000000..fb5b9152e10dfcb6f3ff46c2829f8fe367391b31
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11100_2b14808f903801abb8a5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11100_6505d21488aa6f9a354f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11100_6505d21488aa6f9a354f.png
new file mode 100644
index 0000000000000000000000000000000000000000..c823b68cd481f2100018900daf9112076b8746c2
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11100_6505d21488aa6f9a354f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11100_cdbd5a23643867af9e64.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11100_cdbd5a23643867af9e64.png
new file mode 100644
index 0000000000000000000000000000000000000000..eca50dcd7f636f470767a689866717c33c3c97b0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11100_cdbd5a23643867af9e64.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11300_18e61bb96a6205b22842.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11300_18e61bb96a6205b22842.png
new file mode 100644
index 0000000000000000000000000000000000000000..769547fa6ba04695dbea54f6362a20689a8a1467
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11300_18e61bb96a6205b22842.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11300_1afd81b66bf52f6de906.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11300_1afd81b66bf52f6de906.png
new file mode 100644
index 0000000000000000000000000000000000000000..e11950ed5b6adfc52a22e499a7989348b859001c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11300_1afd81b66bf52f6de906.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11300_8b258aa489e979ffe137.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11300_8b258aa489e979ffe137.png
new file mode 100644
index 0000000000000000000000000000000000000000..ead4062f47d3def4d9a296e0199dc7b5fc812768
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11300_8b258aa489e979ffe137.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11300_b415e822ad3e9320e02b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11300_b415e822ad3e9320e02b.png
new file mode 100644
index 0000000000000000000000000000000000000000..f03fd02c216c9959f0a8d260ba5111a70aff1e16
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11300_b415e822ad3e9320e02b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11500_008949055d0818a5061e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11500_008949055d0818a5061e.png
new file mode 100644
index 0000000000000000000000000000000000000000..7a0952e55e65d196fc0c94ac16e0010ee819c8e9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11500_008949055d0818a5061e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11500_60196df0086a64d54a2a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11500_60196df0086a64d54a2a.png
new file mode 100644
index 0000000000000000000000000000000000000000..92115af8b4b8bed01269b19dffdcaf9a3486e916
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11500_60196df0086a64d54a2a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11500_947d024455828ec339aa.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11500_947d024455828ec339aa.png
new file mode 100644
index 0000000000000000000000000000000000000000..1e92646d4e38b65d7057a5ed681a7abf7e581609
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11500_947d024455828ec339aa.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11500_fa5a1b3190042562e46f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11500_fa5a1b3190042562e46f.png
new file mode 100644
index 0000000000000000000000000000000000000000..e1586950cf7aa0643ffbf22bcb01703545c8e35c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11500_fa5a1b3190042562e46f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11700_33a52cce1a8f3ed58619.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11700_33a52cce1a8f3ed58619.png
new file mode 100644
index 0000000000000000000000000000000000000000..cfd0959d7f3c95cdb848a41bee64379deb321537
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11700_33a52cce1a8f3ed58619.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11700_36022b35c75233684a2c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11700_36022b35c75233684a2c.png
new file mode 100644
index 0000000000000000000000000000000000000000..09d87458028fbfed958937cacb89aebabe0af531
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11700_36022b35c75233684a2c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:36022b35c75233684a2ce180b3af42413007f7ff3e033a515d010fb39d49a655
+size 119493
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11700_83db6808049b4c29280e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11700_83db6808049b4c29280e.png
new file mode 100644
index 0000000000000000000000000000000000000000..5c8439679537649a0b6a852e5af5395aae0335fd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11700_83db6808049b4c29280e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11700_8f43065b6d7dfb06227c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11700_8f43065b6d7dfb06227c.png
new file mode 100644
index 0000000000000000000000000000000000000000..cde5cd1d4e9d352eae4f5b147cc38d558508d495
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11700_8f43065b6d7dfb06227c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11900_4f170af09f6de475ad0c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11900_4f170af09f6de475ad0c.png
new file mode 100644
index 0000000000000000000000000000000000000000..74dccad3eab027f182a235f159215bb8b1128daf
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11900_4f170af09f6de475ad0c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11900_8fb3d3d864bf938ab2eb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11900_8fb3d3d864bf938ab2eb.png
new file mode 100644
index 0000000000000000000000000000000000000000..01c451a1eb7d1b6423c234eb65bc1e2cc4438f75
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11900_8fb3d3d864bf938ab2eb.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11900_e6861c591d7f3977e707.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11900_e6861c591d7f3977e707.png
new file mode 100644
index 0000000000000000000000000000000000000000..4af6e893a44492de02a0ea6b9120d81afebc4ba6
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11900_e6861c591d7f3977e707.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11900_fb700bfbbbc002fd4cbd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11900_fb700bfbbbc002fd4cbd.png
new file mode 100644
index 0000000000000000000000000000000000000000..eab84a8149f7bac191f91dc3e5fa473efeaec8b5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_11900_fb700bfbbbc002fd4cbd.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12100_33054eff0bb01abd07a2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12100_33054eff0bb01abd07a2.png
new file mode 100644
index 0000000000000000000000000000000000000000..7f3a57e09a29282e08f39696cea42f4f0beb14f4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12100_33054eff0bb01abd07a2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12100_bd0848d4f3da88d701d4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12100_bd0848d4f3da88d701d4.png
new file mode 100644
index 0000000000000000000000000000000000000000..55fceba76e212963a3fefa97376c38d25875b7d4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12100_bd0848d4f3da88d701d4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12100_c712032385203395f09e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12100_c712032385203395f09e.png
new file mode 100644
index 0000000000000000000000000000000000000000..7586188f1d612ce06644790cfa26b2bb9d94ba34
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12100_c712032385203395f09e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12100_ecb119478bc9e3734f2b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12100_ecb119478bc9e3734f2b.png
new file mode 100644
index 0000000000000000000000000000000000000000..e65abdae6e5fce93dd9130f7d696f6229e481e29
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12100_ecb119478bc9e3734f2b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12300_0f45347819eb7e7ea9fc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12300_0f45347819eb7e7ea9fc.png
new file mode 100644
index 0000000000000000000000000000000000000000..7e9a0d3c96ee86ab3dd09fa5e9585ddfb79f9782
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12300_0f45347819eb7e7ea9fc.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12300_a36c5ddcb12560df4821.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12300_a36c5ddcb12560df4821.png
new file mode 100644
index 0000000000000000000000000000000000000000..9a061c7eaea58760d7e4415e5df4848def9789e3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12300_a36c5ddcb12560df4821.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12300_d5db30893e36ddc8a1fd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12300_d5db30893e36ddc8a1fd.png
new file mode 100644
index 0000000000000000000000000000000000000000..a9f3693188fcaa43aeadded3c9b2c7c11a491788
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12300_d5db30893e36ddc8a1fd.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12300_dcd0d2f2e0b5a5e614f9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12300_dcd0d2f2e0b5a5e614f9.png
new file mode 100644
index 0000000000000000000000000000000000000000..3cc6ca9ac2cbc4c9ff85d7db703098de891b6e66
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12300_dcd0d2f2e0b5a5e614f9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12500_2f15ed262b53ac563b1a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12500_2f15ed262b53ac563b1a.png
new file mode 100644
index 0000000000000000000000000000000000000000..a97c5e8e7682b06809fe26d2dfb5a3365db58f1e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12500_2f15ed262b53ac563b1a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12500_396a64fac26f5b8b0584.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12500_396a64fac26f5b8b0584.png
new file mode 100644
index 0000000000000000000000000000000000000000..9aad6749c3c8740e3f34a11fab76a9d584e9930d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12500_396a64fac26f5b8b0584.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12500_6a4d43c4492d8a9a5e87.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12500_6a4d43c4492d8a9a5e87.png
new file mode 100644
index 0000000000000000000000000000000000000000..3fa0e62a9aa7ea5fb91d65c1db38d98e87c56295
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12500_6a4d43c4492d8a9a5e87.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12500_801e37e9baba5b362c2a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12500_801e37e9baba5b362c2a.png
new file mode 100644
index 0000000000000000000000000000000000000000..95a14ab5a0ff335bd61e4645615484516048e85a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12500_801e37e9baba5b362c2a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12700_53920beaaf5638d6c440.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12700_53920beaaf5638d6c440.png
new file mode 100644
index 0000000000000000000000000000000000000000..e06081852a082e29c6ee18479697a59ab4c39496
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12700_53920beaaf5638d6c440.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12700_63f32478dd40dd2655b7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12700_63f32478dd40dd2655b7.png
new file mode 100644
index 0000000000000000000000000000000000000000..f9228aa6176ff94de81a330db01cb79879d011d6
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12700_63f32478dd40dd2655b7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12700_9a192390b8e3c38ae3a4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12700_9a192390b8e3c38ae3a4.png
new file mode 100644
index 0000000000000000000000000000000000000000..01bafd30afdf961bc959f4bcb01c73e2c242595d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12700_9a192390b8e3c38ae3a4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12700_f4f83f24638f0a58f6d1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12700_f4f83f24638f0a58f6d1.png
new file mode 100644
index 0000000000000000000000000000000000000000..dd1983531caecbb43a2e09a3fd12d5a63dad71a3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12700_f4f83f24638f0a58f6d1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12900_1c84e69dc3fd87007cb6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12900_1c84e69dc3fd87007cb6.png
new file mode 100644
index 0000000000000000000000000000000000000000..8ff86c9157b75f934fcfd8fbe4b90eb5a533adf6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12900_1c84e69dc3fd87007cb6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1c84e69dc3fd87007cb6f9d9b2d2e1186f3126bac2bc1acfef4577b88b2ddeea
+size 122179
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12900_4e1ad11ef5af69496f7a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12900_4e1ad11ef5af69496f7a.png
new file mode 100644
index 0000000000000000000000000000000000000000..08d98d32e8123c2d629041ed8fe108a3427b39dd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12900_4e1ad11ef5af69496f7a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12900_929c7adee777508ab264.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12900_929c7adee777508ab264.png
new file mode 100644
index 0000000000000000000000000000000000000000..fb7a88e9826987261a14ddb416626ed770ba533a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12900_929c7adee777508ab264.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12900_9687e6c1a88e55de66f0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12900_9687e6c1a88e55de66f0.png
new file mode 100644
index 0000000000000000000000000000000000000000..741d3b38125eeff2aa877ef2100b91ad2badbd8c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_12900_9687e6c1a88e55de66f0.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1300_1ffb7890385ad94a006c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1300_1ffb7890385ad94a006c.png
new file mode 100644
index 0000000000000000000000000000000000000000..879fb23015b8340c0c5a538afcf95db084b1faef
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1300_1ffb7890385ad94a006c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1300_66250bd5afcfacc416d1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1300_66250bd5afcfacc416d1.png
new file mode 100644
index 0000000000000000000000000000000000000000..5226be03039e6d256c4d7204143e783f78aab1de
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1300_66250bd5afcfacc416d1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1300_97aeb7ced813c18f059a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1300_97aeb7ced813c18f059a.png
new file mode 100644
index 0000000000000000000000000000000000000000..f467b91cae38826a446ccacba7028f87145bd887
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1300_97aeb7ced813c18f059a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1300_9a1e0b6e0c1a69819957.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1300_9a1e0b6e0c1a69819957.png
new file mode 100644
index 0000000000000000000000000000000000000000..edb2beddee5e4cbbbb29cc3fa0825bfe8082bb2b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1300_9a1e0b6e0c1a69819957.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13100_4b1bded3f581d9e43ae6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13100_4b1bded3f581d9e43ae6.png
new file mode 100644
index 0000000000000000000000000000000000000000..858fd60142ed53e0b25a0b9b05fd2a9e008fa000
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13100_4b1bded3f581d9e43ae6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13100_4e3ffe17153e07083a9d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13100_4e3ffe17153e07083a9d.png
new file mode 100644
index 0000000000000000000000000000000000000000..28baf8f652a5aac140a520097b984d4e8350babc
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13100_4e3ffe17153e07083a9d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13100_dad39550acbc23dc7f32.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13100_dad39550acbc23dc7f32.png
new file mode 100644
index 0000000000000000000000000000000000000000..725fa5c39597a6ec4c090be75e93c09172c099a2
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13100_dad39550acbc23dc7f32.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13100_f33d3fe88275f0bbe660.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13100_f33d3fe88275f0bbe660.png
new file mode 100644
index 0000000000000000000000000000000000000000..eeaa63f4d66b16e9645e1041ee05acf7e8c3b38f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13100_f33d3fe88275f0bbe660.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13300_08244330712a5fd83991.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13300_08244330712a5fd83991.png
new file mode 100644
index 0000000000000000000000000000000000000000..1758407847472545d8eaef913bfdf4d6ff23efda
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13300_08244330712a5fd83991.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13300_51cc4950da13956e0aed.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13300_51cc4950da13956e0aed.png
new file mode 100644
index 0000000000000000000000000000000000000000..e6e617c16991435969e96f59a109185ab11dc983
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13300_51cc4950da13956e0aed.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:51cc4950da13956e0aedf0a06a8ce48619f8efb1bab8c0588875b4842107e074
+size 103028
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13300_e1373e6c044126c1b579.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13300_e1373e6c044126c1b579.png
new file mode 100644
index 0000000000000000000000000000000000000000..877f06a7eb42d085d36e27471d4f12caaa1fb441
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13300_e1373e6c044126c1b579.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13300_f17fc2068ee64a7693e1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13300_f17fc2068ee64a7693e1.png
new file mode 100644
index 0000000000000000000000000000000000000000..3c6474b378f35518546b64cba8020b9fd084956f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13300_f17fc2068ee64a7693e1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13500_16a54d65e4ef96f3697d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13500_16a54d65e4ef96f3697d.png
new file mode 100644
index 0000000000000000000000000000000000000000..ebb34b6d5f3581067118eaf21fd46bb03d1e197f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13500_16a54d65e4ef96f3697d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13500_1e3c65b96b89aa55e0fe.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13500_1e3c65b96b89aa55e0fe.png
new file mode 100644
index 0000000000000000000000000000000000000000..3799652af3dd5969cef100bc5fd2a5c914f5f09b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13500_1e3c65b96b89aa55e0fe.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13500_40baf57715a3e036f118.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13500_40baf57715a3e036f118.png
new file mode 100644
index 0000000000000000000000000000000000000000..de20641ca9692b384cf830a516e3a13ecfb69299
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13500_40baf57715a3e036f118.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13500_e1bac00d68d746acdcd2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13500_e1bac00d68d746acdcd2.png
new file mode 100644
index 0000000000000000000000000000000000000000..22b549a641538a9a8a4a1eaef75380a581f88043
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13500_e1bac00d68d746acdcd2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13700_0500c0113ac326ec15e3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13700_0500c0113ac326ec15e3.png
new file mode 100644
index 0000000000000000000000000000000000000000..a4f7aa1339b833ef531730041952808655a9f5be
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13700_0500c0113ac326ec15e3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13700_6dd8b3d5465ec5d02a38.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13700_6dd8b3d5465ec5d02a38.png
new file mode 100644
index 0000000000000000000000000000000000000000..83b471eadd7fee5b54a22bb4bc5432eec82cec27
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13700_6dd8b3d5465ec5d02a38.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13700_cab0e0c1fbac34ea7c76.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13700_cab0e0c1fbac34ea7c76.png
new file mode 100644
index 0000000000000000000000000000000000000000..064e51423d7c08d5e8a66617a01fa043c230e900
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13700_cab0e0c1fbac34ea7c76.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13700_d2d50431152e3b4a9cc5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13700_d2d50431152e3b4a9cc5.png
new file mode 100644
index 0000000000000000000000000000000000000000..520d01985bc42c465dcba713fff321715cd67b6f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13700_d2d50431152e3b4a9cc5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13900_80320e865cf76cf829f5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13900_80320e865cf76cf829f5.png
new file mode 100644
index 0000000000000000000000000000000000000000..bdfac1a7da14c0f227b51d9ff15d6e2959135e50
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13900_80320e865cf76cf829f5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13900_8fcf522535d0008eed0c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13900_8fcf522535d0008eed0c.png
new file mode 100644
index 0000000000000000000000000000000000000000..25a86628d46979d38ad1c172da764ba57d31176f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13900_8fcf522535d0008eed0c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13900_abb95136614cf7653252.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13900_abb95136614cf7653252.png
new file mode 100644
index 0000000000000000000000000000000000000000..757a0a2e305e26bb1e90f78d52e6b41d2eaebe1a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13900_abb95136614cf7653252.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13900_d69e2420ccfc4b3dbacb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13900_d69e2420ccfc4b3dbacb.png
new file mode 100644
index 0000000000000000000000000000000000000000..d339cf695368b4803bd7fa014ec7cf2bef99f5fd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_13900_d69e2420ccfc4b3dbacb.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14100_6e82029fb2942a8bb2e0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14100_6e82029fb2942a8bb2e0.png
new file mode 100644
index 0000000000000000000000000000000000000000..0286481e3ab17f5ee471e32d6adac270468b565c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14100_6e82029fb2942a8bb2e0.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14100_a45e995975f606e6d300.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14100_a45e995975f606e6d300.png
new file mode 100644
index 0000000000000000000000000000000000000000..28a6b4fc6862f090edcebe7a4914fc1a4bba834f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14100_a45e995975f606e6d300.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14100_dea2eb2131c51cb4fc7c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14100_dea2eb2131c51cb4fc7c.png
new file mode 100644
index 0000000000000000000000000000000000000000..d660a3e4ed95669d64882897745e762c4728f74e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14100_dea2eb2131c51cb4fc7c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14100_fc9d9e45921023af6658.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14100_fc9d9e45921023af6658.png
new file mode 100644
index 0000000000000000000000000000000000000000..e2e56f5ff19d2ad7ae57905ea685a648a56b647f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14100_fc9d9e45921023af6658.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14300_d8f273a6f88c6380845e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14300_d8f273a6f88c6380845e.png
new file mode 100644
index 0000000000000000000000000000000000000000..f7ec358c44cc24825511e0adedc18475e0001aa7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14300_d8f273a6f88c6380845e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14300_db7394a3654a92ed9533.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14300_db7394a3654a92ed9533.png
new file mode 100644
index 0000000000000000000000000000000000000000..a07e6f6184310313d2c23e73e88327145f2f84d9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14300_db7394a3654a92ed9533.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14300_e11c155e166e18e76b94.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14300_e11c155e166e18e76b94.png
new file mode 100644
index 0000000000000000000000000000000000000000..638b9ac9ff806ec51ff9c85c7e0ac8196a09c621
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14300_e11c155e166e18e76b94.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14300_eeef917cd61cac886582.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14300_eeef917cd61cac886582.png
new file mode 100644
index 0000000000000000000000000000000000000000..f5d838a370954f4173b59f978950c981b8e73b1a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14300_eeef917cd61cac886582.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14500_3568952ed9ed53cf6e90.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14500_3568952ed9ed53cf6e90.png
new file mode 100644
index 0000000000000000000000000000000000000000..062d07d537fc5496c5394ce1828d1c76446ad21c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14500_3568952ed9ed53cf6e90.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14500_4f7f35fbec6dfc02fe09.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14500_4f7f35fbec6dfc02fe09.png
new file mode 100644
index 0000000000000000000000000000000000000000..bd261764dd28f5734fac1e283337855c3d31434d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14500_4f7f35fbec6dfc02fe09.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14500_9bcc77c7ebf2b18b89c8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14500_9bcc77c7ebf2b18b89c8.png
new file mode 100644
index 0000000000000000000000000000000000000000..8ec2b0840b7ebaa3a27e8bceb8027821aaac6cb7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14500_9bcc77c7ebf2b18b89c8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9bcc77c7ebf2b18b89c8db52a405d5d57d1c0596afc65c60845697255687ea3a
+size 131700
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14500_b2820e0b70134279a8db.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14500_b2820e0b70134279a8db.png
new file mode 100644
index 0000000000000000000000000000000000000000..505a50e6cb61fd63b2f25c2ff8a6c8b2001a6b4f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14500_b2820e0b70134279a8db.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14700_086da0a90aaa13b6775a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14700_086da0a90aaa13b6775a.png
new file mode 100644
index 0000000000000000000000000000000000000000..98e59982d12b03007945ffd680de42483fa88a71
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14700_086da0a90aaa13b6775a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14700_335dbc6a574f2718d5b9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14700_335dbc6a574f2718d5b9.png
new file mode 100644
index 0000000000000000000000000000000000000000..af1a651b19d0eefd1453769fddea3d7b39f9aba3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14700_335dbc6a574f2718d5b9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14700_c5b26e44b28f2cf5bb12.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14700_c5b26e44b28f2cf5bb12.png
new file mode 100644
index 0000000000000000000000000000000000000000..9f249d200b8d18a25b2da4097086856b0debdd4c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14700_c5b26e44b28f2cf5bb12.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14700_fe632856d3f775a66136.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14700_fe632856d3f775a66136.png
new file mode 100644
index 0000000000000000000000000000000000000000..927920e10ebc1eca7977373dc05fa5eaf8fcccf1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14700_fe632856d3f775a66136.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14900_0018f0b75d766735c1a8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14900_0018f0b75d766735c1a8.png
new file mode 100644
index 0000000000000000000000000000000000000000..15ca5606b87f037473c7e8efbe9f17a542e67424
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14900_0018f0b75d766735c1a8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14900_5e7e6921f3950c01fdfd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14900_5e7e6921f3950c01fdfd.png
new file mode 100644
index 0000000000000000000000000000000000000000..ba3c11c1ecfdc200ca50e00bd3363d90226fc74d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14900_5e7e6921f3950c01fdfd.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14900_b02d482421f1d4be247d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14900_b02d482421f1d4be247d.png
new file mode 100644
index 0000000000000000000000000000000000000000..fe987d2ce4937cb7817e9f62ba83f03d814428f5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14900_b02d482421f1d4be247d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14900_e284d10f990c28997d05.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14900_e284d10f990c28997d05.png
new file mode 100644
index 0000000000000000000000000000000000000000..bcca390ca4731dd8e9cab5a9cb2c3bd263a86494
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_14900_e284d10f990c28997d05.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1500_1ace4dc52846437a516b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1500_1ace4dc52846437a516b.png
new file mode 100644
index 0000000000000000000000000000000000000000..4a26ab185013e216a3b436059ff63dbc8902a2d1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1500_1ace4dc52846437a516b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1500_6a68cc5056dbccbab521.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1500_6a68cc5056dbccbab521.png
new file mode 100644
index 0000000000000000000000000000000000000000..b6b651e633149de4da2fe1212e4cd07dd3d06610
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1500_6a68cc5056dbccbab521.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1500_8b8d87a15f102a70b38b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1500_8b8d87a15f102a70b38b.png
new file mode 100644
index 0000000000000000000000000000000000000000..39dbcdcfd63bfa87cda3ba902dff1568cac885d7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1500_8b8d87a15f102a70b38b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1500_e5107abf97a1e0e3a7ff.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1500_e5107abf97a1e0e3a7ff.png
new file mode 100644
index 0000000000000000000000000000000000000000..cb5f513b1af8eee2eb0c237f8d5e355a5bb13930
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1500_e5107abf97a1e0e3a7ff.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15100_305ed87b0070f07db4ad.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15100_305ed87b0070f07db4ad.png
new file mode 100644
index 0000000000000000000000000000000000000000..bf1ebad7e067168fbf3388e42e02bebb7900d7db
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15100_305ed87b0070f07db4ad.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15100_3e32d305430c105105b6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15100_3e32d305430c105105b6.png
new file mode 100644
index 0000000000000000000000000000000000000000..728c4d2568ee68dae1edac5c9f6855b41c5955ef
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15100_3e32d305430c105105b6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15100_795d44ef2ada5c786be9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15100_795d44ef2ada5c786be9.png
new file mode 100644
index 0000000000000000000000000000000000000000..4b258014a503b7e5e4f365f9ec5c5ee2600fbd7d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15100_795d44ef2ada5c786be9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15100_bfa26dba42a4a422a471.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15100_bfa26dba42a4a422a471.png
new file mode 100644
index 0000000000000000000000000000000000000000..cf4426b2092fe38db4bcd019c6294d63b29f2231
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15100_bfa26dba42a4a422a471.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15300_2d18c0e5603173bf9942.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15300_2d18c0e5603173bf9942.png
new file mode 100644
index 0000000000000000000000000000000000000000..01cc9bb5a85d61b03ef80c62e39b98a53651b2b0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15300_2d18c0e5603173bf9942.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15300_408e7dcfc9a59659ffe9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15300_408e7dcfc9a59659ffe9.png
new file mode 100644
index 0000000000000000000000000000000000000000..30989c9132a7ed2f2e956bc8893ab560d1bed2bd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15300_408e7dcfc9a59659ffe9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15300_cae2ddae2cafe4358299.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15300_cae2ddae2cafe4358299.png
new file mode 100644
index 0000000000000000000000000000000000000000..85ae63d28136e012f43aa74a4db39aa89f4b76d7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15300_cae2ddae2cafe4358299.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15300_f8dc52f7ed55793ae365.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15300_f8dc52f7ed55793ae365.png
new file mode 100644
index 0000000000000000000000000000000000000000..7d0b92eb88fc29fa3b2eb35e3f2fa25049bd9a89
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15300_f8dc52f7ed55793ae365.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15500_2f9e0781ddc88a513049.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15500_2f9e0781ddc88a513049.png
new file mode 100644
index 0000000000000000000000000000000000000000..9a90b356d9082fbbb6025c4a74fbe6c00352dbe3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15500_2f9e0781ddc88a513049.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15500_4e54e7cdac26c9d745b3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15500_4e54e7cdac26c9d745b3.png
new file mode 100644
index 0000000000000000000000000000000000000000..a5d7b35e28291612cd073433e2e3cdaee922cd66
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15500_4e54e7cdac26c9d745b3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15500_bd5094ecdaca82c1f05f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15500_bd5094ecdaca82c1f05f.png
new file mode 100644
index 0000000000000000000000000000000000000000..c124028d201be4e27f66f4867b467398e8e0725b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15500_bd5094ecdaca82c1f05f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15500_e4abf0d7644a66423c12.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15500_e4abf0d7644a66423c12.png
new file mode 100644
index 0000000000000000000000000000000000000000..7d68546c6aa3abdaaae00211414d54f5318bc07d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15500_e4abf0d7644a66423c12.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15700_3cde2beb6e6f78e1d559.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15700_3cde2beb6e6f78e1d559.png
new file mode 100644
index 0000000000000000000000000000000000000000..8aecd3f4cde0a206a216e1de6a81613675d7c26d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15700_3cde2beb6e6f78e1d559.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15700_43b84e43a07d74590dd6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15700_43b84e43a07d74590dd6.png
new file mode 100644
index 0000000000000000000000000000000000000000..f507ed9bc332ae8d59d209d402020440fd9361b9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15700_43b84e43a07d74590dd6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15700_5d733d592a9b103b45fc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15700_5d733d592a9b103b45fc.png
new file mode 100644
index 0000000000000000000000000000000000000000..eac98b88db072f3bbb7ef8ab46aeb10030906ad4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15700_5d733d592a9b103b45fc.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15700_74d990b9c7bacce600f8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15700_74d990b9c7bacce600f8.png
new file mode 100644
index 0000000000000000000000000000000000000000..a62c8b3ab87fcda9840d199e27fea00e9a27ded9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15700_74d990b9c7bacce600f8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15900_11731ed66a011231824f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15900_11731ed66a011231824f.png
new file mode 100644
index 0000000000000000000000000000000000000000..1040b5371f27a42f0662afcbdc102b8db4d1f381
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15900_11731ed66a011231824f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15900_75eae5dfebedacef0aae.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15900_75eae5dfebedacef0aae.png
new file mode 100644
index 0000000000000000000000000000000000000000..1ca0f62736eff3750ffdc613c6dc1c47403bafb5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15900_75eae5dfebedacef0aae.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:75eae5dfebedacef0aae7e91d88e443a97908d93e00f6eeef59ac4be760b98e6
+size 115549
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15900_76dce236fa671916ba54.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15900_76dce236fa671916ba54.png
new file mode 100644
index 0000000000000000000000000000000000000000..60b410a6a04db2b9791f85b163364d78f0ce4440
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15900_76dce236fa671916ba54.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15900_8dce498420bbec630ca7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15900_8dce498420bbec630ca7.png
new file mode 100644
index 0000000000000000000000000000000000000000..9f9e8ff568ecf1da5bdda9755b8d1354da5b5554
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_15900_8dce498420bbec630ca7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16100_15e53027693a7228ff50.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16100_15e53027693a7228ff50.png
new file mode 100644
index 0000000000000000000000000000000000000000..66b63b9603bb09f0ed94c6e75f6f276850b7a9ab
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16100_15e53027693a7228ff50.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16100_7e121c0b8af162639426.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16100_7e121c0b8af162639426.png
new file mode 100644
index 0000000000000000000000000000000000000000..0bb6f2753b584e86111506730ee7ade711b1dcea
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16100_7e121c0b8af162639426.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16100_80524e29d1c94df5b0db.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16100_80524e29d1c94df5b0db.png
new file mode 100644
index 0000000000000000000000000000000000000000..3d343667ab01f2fa120bd3c0043696cca171082e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16100_80524e29d1c94df5b0db.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16100_b021c0595d76f0ce91c0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16100_b021c0595d76f0ce91c0.png
new file mode 100644
index 0000000000000000000000000000000000000000..813c42a73753b7afe7621d4d14c9e9a259def3d1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16100_b021c0595d76f0ce91c0.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16300_6216f579f59cbcaa75e3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16300_6216f579f59cbcaa75e3.png
new file mode 100644
index 0000000000000000000000000000000000000000..5f5d13ef86a8132052751f7537db7bc3b8e3215d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16300_6216f579f59cbcaa75e3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16300_62b05ffac6741f144570.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16300_62b05ffac6741f144570.png
new file mode 100644
index 0000000000000000000000000000000000000000..d6b76a2a4215d0b1db289936b1adbe2da473d226
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16300_62b05ffac6741f144570.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16300_846ebb291e8469b2c1ca.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16300_846ebb291e8469b2c1ca.png
new file mode 100644
index 0000000000000000000000000000000000000000..70f42b04af1a5cc69f0f27b83c428560227ac516
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16300_846ebb291e8469b2c1ca.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16300_ad7a84ab17a8ef3f6b00.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16300_ad7a84ab17a8ef3f6b00.png
new file mode 100644
index 0000000000000000000000000000000000000000..408708c0aec6688c51f28a7ec11664fa5cdb162c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16300_ad7a84ab17a8ef3f6b00.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16500_1a786aa85816c702bc0b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16500_1a786aa85816c702bc0b.png
new file mode 100644
index 0000000000000000000000000000000000000000..e27337f0c249c2f74b6a96f237afdbb5eb0b61d6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16500_1a786aa85816c702bc0b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1a786aa85816c702bc0ba1a287613ccda63353cb0803ed62db37f35dd34cedd5
+size 120395
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16500_673186ac9d5589cc9012.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16500_673186ac9d5589cc9012.png
new file mode 100644
index 0000000000000000000000000000000000000000..aff1cff49b2d9cd303627565cbc020a068e33cbb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16500_673186ac9d5589cc9012.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16500_c084d6515e52ca571dc2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16500_c084d6515e52ca571dc2.png
new file mode 100644
index 0000000000000000000000000000000000000000..7f9b54509e352f9a1b5bbebbb80a53c7c8b46244
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16500_c084d6515e52ca571dc2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16500_e50b3aaf32a3991b3339.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16500_e50b3aaf32a3991b3339.png
new file mode 100644
index 0000000000000000000000000000000000000000..038b703fd2886ad0d30e591080595ddff747d591
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16500_e50b3aaf32a3991b3339.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16700_7f8ef94dba512c5d9e1c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16700_7f8ef94dba512c5d9e1c.png
new file mode 100644
index 0000000000000000000000000000000000000000..820f2555d172906dc9a2bb8df916af270cc88534
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16700_7f8ef94dba512c5d9e1c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16700_93e115deeac086c97374.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16700_93e115deeac086c97374.png
new file mode 100644
index 0000000000000000000000000000000000000000..fe9bbe38fee796ffb961e3009d83a11f6b347ad9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16700_93e115deeac086c97374.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16700_a2f092c64c30dc549bd8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16700_a2f092c64c30dc549bd8.png
new file mode 100644
index 0000000000000000000000000000000000000000..a73f8188e2fde73072d04e13c59a97ab3f3cd562
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16700_a2f092c64c30dc549bd8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16700_ecbeb3e9fc73a75c6e2e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16700_ecbeb3e9fc73a75c6e2e.png
new file mode 100644
index 0000000000000000000000000000000000000000..3222ddda9300f5dec35330d84173f1b05325a3a0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16700_ecbeb3e9fc73a75c6e2e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16900_0792f81460ee745dc881.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16900_0792f81460ee745dc881.png
new file mode 100644
index 0000000000000000000000000000000000000000..d2866c76d317aabbfa558de6e9970ad4cc5c84df
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16900_0792f81460ee745dc881.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16900_953764805b262437051a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16900_953764805b262437051a.png
new file mode 100644
index 0000000000000000000000000000000000000000..1b6cc0db40e19b5bea07b1eb4653ce83480bfac4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16900_953764805b262437051a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:953764805b262437051a60ed2fc98f57ad7b9f91500438efb6ba93d632334963
+size 135309
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16900_c5fea65912eed85dc46d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16900_c5fea65912eed85dc46d.png
new file mode 100644
index 0000000000000000000000000000000000000000..a1971f960414fbeb8e4c1146aa442edf815f80fd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16900_c5fea65912eed85dc46d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16900_e59159a46ed7f624cfb1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16900_e59159a46ed7f624cfb1.png
new file mode 100644
index 0000000000000000000000000000000000000000..9c4da8dd70d41f7714abcba179ca61454c29b15c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_16900_e59159a46ed7f624cfb1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1700_79e76f69792ebe71780f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1700_79e76f69792ebe71780f.png
new file mode 100644
index 0000000000000000000000000000000000000000..9d86546ed26913a64a4fef207af67d243a934fc0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1700_79e76f69792ebe71780f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1700_93807144d99fa9bc0539.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1700_93807144d99fa9bc0539.png
new file mode 100644
index 0000000000000000000000000000000000000000..4e3dd51b92f855927e48007f22dfd9e41b652f04
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1700_93807144d99fa9bc0539.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1700_e4af6d24e366b6bd03a5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1700_e4af6d24e366b6bd03a5.png
new file mode 100644
index 0000000000000000000000000000000000000000..c536b7f2a8a249a6bbf270e22d1283c2c4a95a06
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1700_e4af6d24e366b6bd03a5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1700_eee93be46a14482c81b0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1700_eee93be46a14482c81b0.png
new file mode 100644
index 0000000000000000000000000000000000000000..aca0cd6f9236a1a0d5a31243ceea28e57ff44168
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1700_eee93be46a14482c81b0.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17100_62f037f59d51f3f09a5e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17100_62f037f59d51f3f09a5e.png
new file mode 100644
index 0000000000000000000000000000000000000000..f9853cbfd5bf11b5d2ecf4c76c347eec0258cfed
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17100_62f037f59d51f3f09a5e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:62f037f59d51f3f09a5e2d202f5a2ef2feb11c96ec1374a341ad14387d89df42
+size 135229
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17100_69aa149c0eada6b5bb76.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17100_69aa149c0eada6b5bb76.png
new file mode 100644
index 0000000000000000000000000000000000000000..73667fa0e1c10bf9d69b544459f968dd6e686279
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17100_69aa149c0eada6b5bb76.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17100_a44be97193cccc295372.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17100_a44be97193cccc295372.png
new file mode 100644
index 0000000000000000000000000000000000000000..b9b1b2bb7bc411538f411232f0d957ad8ebc26bc
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17100_a44be97193cccc295372.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17100_d634d79b27ee280658a1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17100_d634d79b27ee280658a1.png
new file mode 100644
index 0000000000000000000000000000000000000000..2bc89ee053e1de2e52b54ecbe76ff02c2d0830ba
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17100_d634d79b27ee280658a1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17300_b9bb0a6beffce6ef3fa5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17300_b9bb0a6beffce6ef3fa5.png
new file mode 100644
index 0000000000000000000000000000000000000000..09f182cb9ae016fdad42ed8d6bf2af4cc4f71363
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17300_b9bb0a6beffce6ef3fa5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17300_cab5a787bea4e67036c5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17300_cab5a787bea4e67036c5.png
new file mode 100644
index 0000000000000000000000000000000000000000..52172d96d7495edf21b0c45533579a02455b4f2f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17300_cab5a787bea4e67036c5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17300_d8790adf2efa2e06fe4c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17300_d8790adf2efa2e06fe4c.png
new file mode 100644
index 0000000000000000000000000000000000000000..98676dcb1745a7e7a711219173c1673985d7ebad
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17300_d8790adf2efa2e06fe4c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17300_f497483683783dc00910.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17300_f497483683783dc00910.png
new file mode 100644
index 0000000000000000000000000000000000000000..572293c69036efec7826b626d59dc7e6c7b135ed
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17300_f497483683783dc00910.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f497483683783dc0091099ef8b3d75cff3a539940ab6a2cdc73a2ef6975e4455
+size 111245
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17500_6aca5f9334f8223ec034.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17500_6aca5f9334f8223ec034.png
new file mode 100644
index 0000000000000000000000000000000000000000..d616a666e258ba07772dbb96a26c76b56640a3d1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17500_6aca5f9334f8223ec034.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17500_bf8279e2afc3ea987e36.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17500_bf8279e2afc3ea987e36.png
new file mode 100644
index 0000000000000000000000000000000000000000..ca7c125b26ddd4526c5e78d6f1c9b680e4ed3021
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17500_bf8279e2afc3ea987e36.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17500_cdc917935eb46a32aa2d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17500_cdc917935eb46a32aa2d.png
new file mode 100644
index 0000000000000000000000000000000000000000..71c25750586ba0202e54bc6ef6c0cca73c30f105
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17500_cdc917935eb46a32aa2d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17500_d092e6174dc5d542a22d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17500_d092e6174dc5d542a22d.png
new file mode 100644
index 0000000000000000000000000000000000000000..d5e8b2376f84ee566b0fd034f1d2b75c4168df6f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17500_d092e6174dc5d542a22d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17700_59fb326d7445053d0d8d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17700_59fb326d7445053d0d8d.png
new file mode 100644
index 0000000000000000000000000000000000000000..c5a27a2c5c83a07c8c62c173167422f563bf7555
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17700_59fb326d7445053d0d8d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17700_71a69352dd02a420650e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17700_71a69352dd02a420650e.png
new file mode 100644
index 0000000000000000000000000000000000000000..68b5a43596c34ea3cb95aaedbd175a4584ed4c7b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17700_71a69352dd02a420650e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:71a69352dd02a420650e06f0024ecd9efacc7feec2d859b60eae48c17f8009cb
+size 115746
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17700_ae80d2f3cf928d0ba6af.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17700_ae80d2f3cf928d0ba6af.png
new file mode 100644
index 0000000000000000000000000000000000000000..d7c4d7a053fa58a2eb179ba6b5ec716bb1e0fb45
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17700_ae80d2f3cf928d0ba6af.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17700_c4da9f94b9310ac6f73e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17700_c4da9f94b9310ac6f73e.png
new file mode 100644
index 0000000000000000000000000000000000000000..10b7a63afb31898c8497f318b1597032eeaf40ab
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17700_c4da9f94b9310ac6f73e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17900_0cd42d717f65c75dea08.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17900_0cd42d717f65c75dea08.png
new file mode 100644
index 0000000000000000000000000000000000000000..9c6cfa1e47ec3688526bd5c8e160fd3315831898
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17900_0cd42d717f65c75dea08.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17900_10465aaa869c74f7d37e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17900_10465aaa869c74f7d37e.png
new file mode 100644
index 0000000000000000000000000000000000000000..7353ed7b5964129694f57d1d9ba1cb4ef0b3f722
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17900_10465aaa869c74f7d37e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17900_2bb73801facad52e4c93.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17900_2bb73801facad52e4c93.png
new file mode 100644
index 0000000000000000000000000000000000000000..1aeb26858c7e3a1fe38ce65de5399ad3221be8ac
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17900_2bb73801facad52e4c93.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17900_56b621913bf734688c74.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17900_56b621913bf734688c74.png
new file mode 100644
index 0000000000000000000000000000000000000000..db24b43c4ff0d0dc4e6c315cf7c719452a513228
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_17900_56b621913bf734688c74.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18100_24b29175a47be53f453f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18100_24b29175a47be53f453f.png
new file mode 100644
index 0000000000000000000000000000000000000000..e41dd2d7a51b32a1938df2177983014e2e3ed1eb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18100_24b29175a47be53f453f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18100_ccc1b9b7dc83774db831.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18100_ccc1b9b7dc83774db831.png
new file mode 100644
index 0000000000000000000000000000000000000000..6af3c19c2534630627e439ae841e9104326c8e06
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18100_ccc1b9b7dc83774db831.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18100_d8c7f26e2ed2a8b788b8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18100_d8c7f26e2ed2a8b788b8.png
new file mode 100644
index 0000000000000000000000000000000000000000..6b077c04507f541947175aeeb7a11b0784345f23
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18100_d8c7f26e2ed2a8b788b8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18100_e33a40df421f8f31d057.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18100_e33a40df421f8f31d057.png
new file mode 100644
index 0000000000000000000000000000000000000000..b0b5eb7d1c72f1dabbf9dd2659224d5263da86a4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18100_e33a40df421f8f31d057.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18300_21e9b56269e22202f0cd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18300_21e9b56269e22202f0cd.png
new file mode 100644
index 0000000000000000000000000000000000000000..3f640d960cd3f90aad779327f5b46dfb6bee5a30
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18300_21e9b56269e22202f0cd.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18300_57eafe97ac7133819edf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18300_57eafe97ac7133819edf.png
new file mode 100644
index 0000000000000000000000000000000000000000..99f73043d1a5fa084f8b161d452c0cb84b36a5d6
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18300_57eafe97ac7133819edf.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18300_a98fc7c492468245a8ec.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18300_a98fc7c492468245a8ec.png
new file mode 100644
index 0000000000000000000000000000000000000000..4e2a5963dabfa33e850f1b948295d987f2f64d4b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18300_a98fc7c492468245a8ec.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18300_eeae23ac59290341443c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18300_eeae23ac59290341443c.png
new file mode 100644
index 0000000000000000000000000000000000000000..09b21afe9745a6b2595239cca884d494e8fa9fa0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18300_eeae23ac59290341443c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18500_705346f20351266fd4e1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18500_705346f20351266fd4e1.png
new file mode 100644
index 0000000000000000000000000000000000000000..3cfb477999b524af34d925b5bf9e7b0afd41eedd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18500_705346f20351266fd4e1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18500_74a135d34944fa80c499.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18500_74a135d34944fa80c499.png
new file mode 100644
index 0000000000000000000000000000000000000000..34e2892e634a33078bc16ef41120edd0342b8d38
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18500_74a135d34944fa80c499.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:74a135d34944fa80c4999bffe524ee4810bd7a94e67f99d72a6132fb816a67df
+size 112671
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18500_db84f0a949bf1f8547e0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18500_db84f0a949bf1f8547e0.png
new file mode 100644
index 0000000000000000000000000000000000000000..2e81f316cd9f3d819a8bea2e70a49b93860d5601
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18500_db84f0a949bf1f8547e0.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18500_ebca2b70205400298af2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18500_ebca2b70205400298af2.png
new file mode 100644
index 0000000000000000000000000000000000000000..36c68a66f4c72757166a10d79b55299af0d6c4de
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18500_ebca2b70205400298af2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18700_250cfb9dd44f7c64d33e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18700_250cfb9dd44f7c64d33e.png
new file mode 100644
index 0000000000000000000000000000000000000000..1ef34246842b76fe50dbb68034bf3055c228433a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18700_250cfb9dd44f7c64d33e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18700_4d6917fd4e898674cbd0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18700_4d6917fd4e898674cbd0.png
new file mode 100644
index 0000000000000000000000000000000000000000..7733cf7f1537dce52ee5f958b0d4ed2c7fd261e8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18700_4d6917fd4e898674cbd0.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18700_94499137ea7919a96a00.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18700_94499137ea7919a96a00.png
new file mode 100644
index 0000000000000000000000000000000000000000..562417cc8cc327be56863331aba932795f4f7062
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18700_94499137ea7919a96a00.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18700_bd25f170f09b3ad63785.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18700_bd25f170f09b3ad63785.png
new file mode 100644
index 0000000000000000000000000000000000000000..597934b388c5d61724504e88fcc30a16663083c0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18700_bd25f170f09b3ad63785.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18900_05e66fc9bdbc43d6565a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18900_05e66fc9bdbc43d6565a.png
new file mode 100644
index 0000000000000000000000000000000000000000..10defbb140f64e88dd642d0dad343ffd1b5072af
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18900_05e66fc9bdbc43d6565a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18900_26429e01f1c0cbc4bc38.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18900_26429e01f1c0cbc4bc38.png
new file mode 100644
index 0000000000000000000000000000000000000000..d62dd7189fdffef4d3980c6382b91ceb162e5fd5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18900_26429e01f1c0cbc4bc38.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18900_e01b1bd2e800f4d0ca98.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18900_e01b1bd2e800f4d0ca98.png
new file mode 100644
index 0000000000000000000000000000000000000000..bc3c5ac397c2fecb9e4a97db0ed6c5282ef2141e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18900_e01b1bd2e800f4d0ca98.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18900_ed91d218ec8ec417a429.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18900_ed91d218ec8ec417a429.png
new file mode 100644
index 0000000000000000000000000000000000000000..a1a5fa76fc9b06f191bc94aff6e6c1e83dd8d3c1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_18900_ed91d218ec8ec417a429.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1900_34e790947d2af7bb368e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1900_34e790947d2af7bb368e.png
new file mode 100644
index 0000000000000000000000000000000000000000..c46cf48bde1ad5d1d6870957cc8fe7a394cf55c3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1900_34e790947d2af7bb368e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1900_4f707b5d5bdc1d03cbed.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1900_4f707b5d5bdc1d03cbed.png
new file mode 100644
index 0000000000000000000000000000000000000000..743bf87121aa8fd91ca8fd77fe27d8e3a60a1bc8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1900_4f707b5d5bdc1d03cbed.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1900_5610f618d3421371576e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1900_5610f618d3421371576e.png
new file mode 100644
index 0000000000000000000000000000000000000000..f261ad3a8b2d8bd9a59f81487a21f3a81e2ba0c3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1900_5610f618d3421371576e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1900_e70fba042d3fdfc27633.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1900_e70fba042d3fdfc27633.png
new file mode 100644
index 0000000000000000000000000000000000000000..2b38e81a895092ae759acd90e3bd04364b0feb76
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_1900_e70fba042d3fdfc27633.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19100_2e89ce9602189d78d99d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19100_2e89ce9602189d78d99d.png
new file mode 100644
index 0000000000000000000000000000000000000000..33805528d057344b5c24bc3de9059ec68ed8b78a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19100_2e89ce9602189d78d99d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19100_74b6942311fa11bc0286.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19100_74b6942311fa11bc0286.png
new file mode 100644
index 0000000000000000000000000000000000000000..09587fa622b94a895fa96d115b64abdef5897ae8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19100_74b6942311fa11bc0286.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:74b6942311fa11bc02862cc5898832d85fac60b0b43081f0ed472fbb74de4919
+size 103833
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19100_ae97094b3fae14567030.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19100_ae97094b3fae14567030.png
new file mode 100644
index 0000000000000000000000000000000000000000..2fd45d2b88dd0eb8f51705ee042592fa8ee136ce
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19100_ae97094b3fae14567030.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19100_f93d9ce9f7fe2d3a1994.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19100_f93d9ce9f7fe2d3a1994.png
new file mode 100644
index 0000000000000000000000000000000000000000..95be586aa7006e8ef10c2282576ee3baee8473e9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19100_f93d9ce9f7fe2d3a1994.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19300_05bfc04b088e592d6887.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19300_05bfc04b088e592d6887.png
new file mode 100644
index 0000000000000000000000000000000000000000..b646e96516f4edb0386d055d5701dc7fcfece565
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19300_05bfc04b088e592d6887.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19300_4268c16460f8c26128cd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19300_4268c16460f8c26128cd.png
new file mode 100644
index 0000000000000000000000000000000000000000..63887c7e122f945d8aadb1ae0e8d22358a0a9681
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19300_4268c16460f8c26128cd.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19300_907c18fb808634f1935a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19300_907c18fb808634f1935a.png
new file mode 100644
index 0000000000000000000000000000000000000000..07accb692dc1b8260a5dbc2f8156d5799c1e0b00
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19300_907c18fb808634f1935a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19300_d4d167553a7f9582d321.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19300_d4d167553a7f9582d321.png
new file mode 100644
index 0000000000000000000000000000000000000000..d3fd4c711a3af34663cb962462e633506dfa3101
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19300_d4d167553a7f9582d321.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19500_1035d82a548340bed7b8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19500_1035d82a548340bed7b8.png
new file mode 100644
index 0000000000000000000000000000000000000000..3e69a4b60c8e573f2762a03fb1e6ea6c2ef5a2c7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19500_1035d82a548340bed7b8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19500_67f04cdcfc5553b0ac7c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19500_67f04cdcfc5553b0ac7c.png
new file mode 100644
index 0000000000000000000000000000000000000000..ffcb4f68109be375b88a1ed781e67069e9fa13b9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19500_67f04cdcfc5553b0ac7c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19500_bb9a3a442f7785ddbf62.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19500_bb9a3a442f7785ddbf62.png
new file mode 100644
index 0000000000000000000000000000000000000000..7661d3c3d8063345deda5df4d74a1c9cb4d3a160
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19500_bb9a3a442f7785ddbf62.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19500_fa8bf9d836bd17453ee3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19500_fa8bf9d836bd17453ee3.png
new file mode 100644
index 0000000000000000000000000000000000000000..7d084c8997cb1b5662b9ab3353b53eb495dd0007
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19500_fa8bf9d836bd17453ee3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19700_51d4d5c834aef914f819.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19700_51d4d5c834aef914f819.png
new file mode 100644
index 0000000000000000000000000000000000000000..a5ced0253c40e76d66c2d4cfd4a771167a686f82
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19700_51d4d5c834aef914f819.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19700_6987d4d0fe69143830df.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19700_6987d4d0fe69143830df.png
new file mode 100644
index 0000000000000000000000000000000000000000..1db52b8bff17bedffa796842e2ccf87bfcb98c88
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19700_6987d4d0fe69143830df.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19700_eabfce160548ef80a034.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19700_eabfce160548ef80a034.png
new file mode 100644
index 0000000000000000000000000000000000000000..1c9774577e54cdcb9bf0290fc29fe3d32bb33af1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19700_eabfce160548ef80a034.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19700_ed192468416b554e06ad.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19700_ed192468416b554e06ad.png
new file mode 100644
index 0000000000000000000000000000000000000000..051268f3b2541411aa8636f1f35f870eac5583e3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19700_ed192468416b554e06ad.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19900_53889793845ff052439b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19900_53889793845ff052439b.png
new file mode 100644
index 0000000000000000000000000000000000000000..aa2446a115e87aaf39622e0bd05be0400f922f3a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19900_53889793845ff052439b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19900_67732b8d705a9d6ce672.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19900_67732b8d705a9d6ce672.png
new file mode 100644
index 0000000000000000000000000000000000000000..ac6d6f48a1be25daed5768d3ad201aa5d4a6c913
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19900_67732b8d705a9d6ce672.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19900_869b8dbb1fbc5103bff8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19900_869b8dbb1fbc5103bff8.png
new file mode 100644
index 0000000000000000000000000000000000000000..ae2e58e953b836b41f69c4544576f1c829ee120d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19900_869b8dbb1fbc5103bff8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19900_ea9d28428d208acaeda5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19900_ea9d28428d208acaeda5.png
new file mode 100644
index 0000000000000000000000000000000000000000..01a6030b430225c85eec495432b7a3fa27dff57d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_19900_ea9d28428d208acaeda5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20100_321bddafae5313a296a8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20100_321bddafae5313a296a8.png
new file mode 100644
index 0000000000000000000000000000000000000000..2cf461ad81075f01e2f9803a9ca7b6c15867cf5e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20100_321bddafae5313a296a8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20100_ac86a68d4b853bf33da4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20100_ac86a68d4b853bf33da4.png
new file mode 100644
index 0000000000000000000000000000000000000000..582849ceddba070a8864ec0460fbcf0b41a33b7f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20100_ac86a68d4b853bf33da4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20100_f1d4aadc70f26d262477.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20100_f1d4aadc70f26d262477.png
new file mode 100644
index 0000000000000000000000000000000000000000..8b769b7142a477e71ae1fc2e0f9d02ec40b90165
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20100_f1d4aadc70f26d262477.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20100_f4159f7458809c0fc399.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20100_f4159f7458809c0fc399.png
new file mode 100644
index 0000000000000000000000000000000000000000..cacb8a3bfbfc9ef3015228967ed3853fa3a2c958
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20100_f4159f7458809c0fc399.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20300_1e0f26f62357a6606ee5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20300_1e0f26f62357a6606ee5.png
new file mode 100644
index 0000000000000000000000000000000000000000..e57c21ee7bf11b206de79c205ee5e3c9fb567e1f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20300_1e0f26f62357a6606ee5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20300_88bdb28c9cc8f78f7123.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20300_88bdb28c9cc8f78f7123.png
new file mode 100644
index 0000000000000000000000000000000000000000..53f68fd5067a0bbd56754631c1fd7ecfdf1afabb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20300_88bdb28c9cc8f78f7123.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20300_a3ea39f56cc26ed427ef.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20300_a3ea39f56cc26ed427ef.png
new file mode 100644
index 0000000000000000000000000000000000000000..aef99e0de8162b4f46a87041f470f2d4f09077b6
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20300_a3ea39f56cc26ed427ef.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20300_d2f8e2f72a915c2a24df.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20300_d2f8e2f72a915c2a24df.png
new file mode 100644
index 0000000000000000000000000000000000000000..c241010c84604bbe1c162776610dc4c9478719cd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20300_d2f8e2f72a915c2a24df.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20500_175f2d590001c9261b07.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20500_175f2d590001c9261b07.png
new file mode 100644
index 0000000000000000000000000000000000000000..73a93abc78588d8ed00ac8b8c25208b0cbba6eec
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20500_175f2d590001c9261b07.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20500_4fb4d704f7608d5347d7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20500_4fb4d704f7608d5347d7.png
new file mode 100644
index 0000000000000000000000000000000000000000..f79270df8766b4deea252a6610949ea28f657fc7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20500_4fb4d704f7608d5347d7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20500_81433ecbb816778502e2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20500_81433ecbb816778502e2.png
new file mode 100644
index 0000000000000000000000000000000000000000..f1b486dcd1f5a2d81c874413c6fcda0a5f36115a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20500_81433ecbb816778502e2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20500_d4dc40166b11e0cd9c2e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20500_d4dc40166b11e0cd9c2e.png
new file mode 100644
index 0000000000000000000000000000000000000000..28321097b88825f2e6d1d5e3734196387a5a10f0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20500_d4dc40166b11e0cd9c2e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20700_b46f734c8cad3d1267bd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20700_b46f734c8cad3d1267bd.png
new file mode 100644
index 0000000000000000000000000000000000000000..79dceec3c9d0e932765742553a6c514429195a6f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20700_b46f734c8cad3d1267bd.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20700_d3e02edb04a859231c66.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20700_d3e02edb04a859231c66.png
new file mode 100644
index 0000000000000000000000000000000000000000..aeddf394cba3ccedf926af6e35dce2db896a725d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20700_d3e02edb04a859231c66.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20700_d7b7a0242fc3c8623186.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20700_d7b7a0242fc3c8623186.png
new file mode 100644
index 0000000000000000000000000000000000000000..a19a068df13f374d70c89abebbd97fbe8490b0cb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20700_d7b7a0242fc3c8623186.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20700_f159df2648e24a25f66e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20700_f159df2648e24a25f66e.png
new file mode 100644
index 0000000000000000000000000000000000000000..ad0b02cfbae3c3d35599c9ca76380293a9ec0c11
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20700_f159df2648e24a25f66e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20900_2814ba0b46b29b11a776.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20900_2814ba0b46b29b11a776.png
new file mode 100644
index 0000000000000000000000000000000000000000..7c9b71b2d2898f5d3aa68c7fe87a361720aa28a8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20900_2814ba0b46b29b11a776.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20900_3b66af640ca7b444df7e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20900_3b66af640ca7b444df7e.png
new file mode 100644
index 0000000000000000000000000000000000000000..3f442ec6c361fb413cd9b7ab4ea00629c994b2a8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20900_3b66af640ca7b444df7e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20900_83ea771dc8bfcf993e79.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20900_83ea771dc8bfcf993e79.png
new file mode 100644
index 0000000000000000000000000000000000000000..c18856294c6d68d3016d2fbce480718d7b9f57db
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20900_83ea771dc8bfcf993e79.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20900_ccb2f669b891afbc68ca.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20900_ccb2f669b891afbc68ca.png
new file mode 100644
index 0000000000000000000000000000000000000000..c6f8ceea8030d0d246e33bf83f5207e63e7d7086
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_20900_ccb2f669b891afbc68ca.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2100_05d8fd8cf7405789eada.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2100_05d8fd8cf7405789eada.png
new file mode 100644
index 0000000000000000000000000000000000000000..1d43841ee0bd88bcb8e556f4d578d62c72b6a794
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2100_05d8fd8cf7405789eada.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2100_2c57860d967d38ddf7c0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2100_2c57860d967d38ddf7c0.png
new file mode 100644
index 0000000000000000000000000000000000000000..b22411b809820a540f65d1e751001a7276e8eb12
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2100_2c57860d967d38ddf7c0.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2100_38643b89f8c1bb09d7a1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2100_38643b89f8c1bb09d7a1.png
new file mode 100644
index 0000000000000000000000000000000000000000..913541eb28bcace0c51cfac9a108054099ca9b5d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2100_38643b89f8c1bb09d7a1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2100_a84c017afcf803f8e8eb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2100_a84c017afcf803f8e8eb.png
new file mode 100644
index 0000000000000000000000000000000000000000..02a92aecbd0cfdcd869a99eb1d0a2d8ecd55a77e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2100_a84c017afcf803f8e8eb.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21100_aa047dc0291fddc1a8a6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21100_aa047dc0291fddc1a8a6.png
new file mode 100644
index 0000000000000000000000000000000000000000..cc393e041a9edee11b95747b7157cad4a6dea6e7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21100_aa047dc0291fddc1a8a6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21100_b1d5ae8caa8ada4d3c1e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21100_b1d5ae8caa8ada4d3c1e.png
new file mode 100644
index 0000000000000000000000000000000000000000..aa74ff60577a43a7e16c893e2ac1132c74e78109
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21100_b1d5ae8caa8ada4d3c1e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21100_dac58c23136c982a1197.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21100_dac58c23136c982a1197.png
new file mode 100644
index 0000000000000000000000000000000000000000..78bcc50ecd388858d363eb3e880eb0273787c438
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21100_dac58c23136c982a1197.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21100_dae9d8a4f7d3e73b44bd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21100_dae9d8a4f7d3e73b44bd.png
new file mode 100644
index 0000000000000000000000000000000000000000..a30d1e8dd12339b8810ac1527ce3d73ebdf82c40
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21100_dae9d8a4f7d3e73b44bd.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21300_42d6c7bf8c6c18b59f05.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21300_42d6c7bf8c6c18b59f05.png
new file mode 100644
index 0000000000000000000000000000000000000000..5e0ede34a15c5f29d09b6bd9203a1b37ec3b9c47
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21300_42d6c7bf8c6c18b59f05.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21300_7e10324c401c8d57f88a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21300_7e10324c401c8d57f88a.png
new file mode 100644
index 0000000000000000000000000000000000000000..760c43af0b3612b8a927e0371e86960dbd93405b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21300_7e10324c401c8d57f88a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21300_aada3db99c22214da4f6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21300_aada3db99c22214da4f6.png
new file mode 100644
index 0000000000000000000000000000000000000000..1e4763fc298765842b2a1561360330c8633940ea
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21300_aada3db99c22214da4f6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21300_e50b6633788381e61c66.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21300_e50b6633788381e61c66.png
new file mode 100644
index 0000000000000000000000000000000000000000..c514686ca1a6bb967494c972675409deef11960a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21300_e50b6633788381e61c66.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21500_3319d341c8c91d323684.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21500_3319d341c8c91d323684.png
new file mode 100644
index 0000000000000000000000000000000000000000..bb86f6220e3b13777f99944bdde85d11421d7b62
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21500_3319d341c8c91d323684.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21500_3aeabf6ddc1ee619a5fb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21500_3aeabf6ddc1ee619a5fb.png
new file mode 100644
index 0000000000000000000000000000000000000000..aca002aac9d9f8aa32444c37e07ab80ea0b2e033
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21500_3aeabf6ddc1ee619a5fb.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21500_71b8cc4af1eef03f55f2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21500_71b8cc4af1eef03f55f2.png
new file mode 100644
index 0000000000000000000000000000000000000000..9c4b582f552b4a787f9a113635a2fdc101c7d574
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21500_71b8cc4af1eef03f55f2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21500_967083a3ffb5d839f2c4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21500_967083a3ffb5d839f2c4.png
new file mode 100644
index 0000000000000000000000000000000000000000..c7958b1d81d1dfdfa4a89eb24d7825f355bae457
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21500_967083a3ffb5d839f2c4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21700_6ac324efcbdce1c56147.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21700_6ac324efcbdce1c56147.png
new file mode 100644
index 0000000000000000000000000000000000000000..ca943d6f4223913511f4a312a61af24f325f8898
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21700_6ac324efcbdce1c56147.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21700_7bf208191bf32c489643.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21700_7bf208191bf32c489643.png
new file mode 100644
index 0000000000000000000000000000000000000000..2a8f1874e4b866d837469f05e0c7e52854f72b5e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21700_7bf208191bf32c489643.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21700_b52ca877813b2d20d743.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21700_b52ca877813b2d20d743.png
new file mode 100644
index 0000000000000000000000000000000000000000..b0eb294c1b4ef5cfe6f70a165f50f22b5a436090
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21700_b52ca877813b2d20d743.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21700_d24804d3633e3c987c5c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21700_d24804d3633e3c987c5c.png
new file mode 100644
index 0000000000000000000000000000000000000000..14c7c7a206924274b1cc0779cc69b6ec3eecd6d1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21700_d24804d3633e3c987c5c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21900_1fac4558069ef26c8f28.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21900_1fac4558069ef26c8f28.png
new file mode 100644
index 0000000000000000000000000000000000000000..50f2fd7309427d3d7b745eb76fd9954f19957d74
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21900_1fac4558069ef26c8f28.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21900_8bd1f08588dcc01a3f39.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21900_8bd1f08588dcc01a3f39.png
new file mode 100644
index 0000000000000000000000000000000000000000..cbb6fb8c2e902c3976d845ff2232d905ec92e529
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21900_8bd1f08588dcc01a3f39.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8bd1f08588dcc01a3f39140a6fc50c3216bf2d0550aa3c631869303b88a6eaa5
+size 139087
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21900_a55b5bb785b183f35d60.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21900_a55b5bb785b183f35d60.png
new file mode 100644
index 0000000000000000000000000000000000000000..5ac0ada06ac083dd6d61cff32384eed2b065fcd7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21900_a55b5bb785b183f35d60.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21900_baae2fa41cc3233f23ad.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21900_baae2fa41cc3233f23ad.png
new file mode 100644
index 0000000000000000000000000000000000000000..7593a5d7f3b058457cd605fdf2b5c4c16bbe1909
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_21900_baae2fa41cc3233f23ad.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22100_2bbb9e8f5ccec42d3240.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22100_2bbb9e8f5ccec42d3240.png
new file mode 100644
index 0000000000000000000000000000000000000000..175d4a13082d060f360763d71902c859ccc63666
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22100_2bbb9e8f5ccec42d3240.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22100_382aa176432db10bb8ed.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22100_382aa176432db10bb8ed.png
new file mode 100644
index 0000000000000000000000000000000000000000..cd7545a4e98b28be5ea4d0fe74802c2b1c4b1e5b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22100_382aa176432db10bb8ed.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22100_8533ae2243764582637d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22100_8533ae2243764582637d.png
new file mode 100644
index 0000000000000000000000000000000000000000..831a8fb9bedae36c4089d45e6268f5252367cf42
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22100_8533ae2243764582637d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22100_ab5e7c47b646fb30f8de.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22100_ab5e7c47b646fb30f8de.png
new file mode 100644
index 0000000000000000000000000000000000000000..362000436eb93439f803b79540fcbfb700064479
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22100_ab5e7c47b646fb30f8de.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22300_3942706cf230f15ac1b7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22300_3942706cf230f15ac1b7.png
new file mode 100644
index 0000000000000000000000000000000000000000..efbf8fbeeec657d255b538bc2acbb1a080e85eb9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22300_3942706cf230f15ac1b7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22300_453f8a695c735dbbb2ad.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22300_453f8a695c735dbbb2ad.png
new file mode 100644
index 0000000000000000000000000000000000000000..530c95a927c090624fe6b33b1cc1e241af8061b7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22300_453f8a695c735dbbb2ad.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22300_bf58b27d5d7c0e733fde.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22300_bf58b27d5d7c0e733fde.png
new file mode 100644
index 0000000000000000000000000000000000000000..01a766e01a7e03317e6793811836934920573521
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22300_bf58b27d5d7c0e733fde.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22300_f69392ba3a8cdf636fde.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22300_f69392ba3a8cdf636fde.png
new file mode 100644
index 0000000000000000000000000000000000000000..70856384f6b798bc7a2bf5f752ac57b985b71719
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22300_f69392ba3a8cdf636fde.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22500_80cc95971bcea97b07e0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22500_80cc95971bcea97b07e0.png
new file mode 100644
index 0000000000000000000000000000000000000000..f396cd8bb117928e54f8e41b1e6695e42117021b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22500_80cc95971bcea97b07e0.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22500_a47838bb9931af4a9ea9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22500_a47838bb9931af4a9ea9.png
new file mode 100644
index 0000000000000000000000000000000000000000..32e924b1e7045c4f41d1bcf2dc5f54fc1a86efea
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22500_a47838bb9931af4a9ea9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22500_c516fda2f092cf4bc333.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22500_c516fda2f092cf4bc333.png
new file mode 100644
index 0000000000000000000000000000000000000000..260160502fa433dd4359ca63b92f5c284be9b4f5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22500_c516fda2f092cf4bc333.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22500_f3682aff4301d6b9a6f1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22500_f3682aff4301d6b9a6f1.png
new file mode 100644
index 0000000000000000000000000000000000000000..c577ec9d6166f6ea27b746af0bda59c061edc57d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22500_f3682aff4301d6b9a6f1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22700_41e94cf5b2e3c2dbd774.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22700_41e94cf5b2e3c2dbd774.png
new file mode 100644
index 0000000000000000000000000000000000000000..71863ef34d1f6f6e2632b52aa3be5d20e7e43154
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22700_41e94cf5b2e3c2dbd774.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22700_42fbe78031549e5cbcb4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22700_42fbe78031549e5cbcb4.png
new file mode 100644
index 0000000000000000000000000000000000000000..3c7222a1feabe60530c9310725a296ddf9404d59
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22700_42fbe78031549e5cbcb4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22700_45f2af943bd4079285c5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22700_45f2af943bd4079285c5.png
new file mode 100644
index 0000000000000000000000000000000000000000..40935470245d5f64588cd73f9aaec1e799e51383
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22700_45f2af943bd4079285c5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22700_4c855f06ace0ea118221.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22700_4c855f06ace0ea118221.png
new file mode 100644
index 0000000000000000000000000000000000000000..6e4c93c6ef97ea0d09cfa30d46fa8f573dac514b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22700_4c855f06ace0ea118221.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22900_75e9f626c79ff0de8f10.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22900_75e9f626c79ff0de8f10.png
new file mode 100644
index 0000000000000000000000000000000000000000..6ae95a886d8dc584f104302573da8f82f70795db
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22900_75e9f626c79ff0de8f10.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22900_a165a01192ecccd465a4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22900_a165a01192ecccd465a4.png
new file mode 100644
index 0000000000000000000000000000000000000000..82d6f9f2f87829d0541bfad9fa9c86d9c340a76e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22900_a165a01192ecccd465a4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22900_b95c633d219ba620009c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22900_b95c633d219ba620009c.png
new file mode 100644
index 0000000000000000000000000000000000000000..94e5073b65148c1de34b8e66ae70c4219222f0ee
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22900_b95c633d219ba620009c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22900_e796f63f7a74b3f8b13b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22900_e796f63f7a74b3f8b13b.png
new file mode 100644
index 0000000000000000000000000000000000000000..c8c6d583d828772da4f0bdfbe46a99d3a82c3737
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_22900_e796f63f7a74b3f8b13b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2300_1a78ac2007189a350f47.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2300_1a78ac2007189a350f47.png
new file mode 100644
index 0000000000000000000000000000000000000000..0eddfac1eba5913b9013c4131819a4c063986b39
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2300_1a78ac2007189a350f47.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2300_1b7732d72ce10a0ec587.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2300_1b7732d72ce10a0ec587.png
new file mode 100644
index 0000000000000000000000000000000000000000..69402967c5f840baf322641134d966b83f401754
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2300_1b7732d72ce10a0ec587.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2300_acc26ef1bce61147ad95.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2300_acc26ef1bce61147ad95.png
new file mode 100644
index 0000000000000000000000000000000000000000..fb310006c1c2d980116ff9e3a76f645d1158c35e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2300_acc26ef1bce61147ad95.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2300_b93b3fd975b8c7da9fb2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2300_b93b3fd975b8c7da9fb2.png
new file mode 100644
index 0000000000000000000000000000000000000000..df468cea40254f4db94e30591ef4afea36b136b9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2300_b93b3fd975b8c7da9fb2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23100_0857b21c732a7ed4993a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23100_0857b21c732a7ed4993a.png
new file mode 100644
index 0000000000000000000000000000000000000000..6d4879495aa9a93fe926f78f7b32dd39772efd12
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23100_0857b21c732a7ed4993a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0857b21c732a7ed4993a65439c6eaf325abfcb9203f48ab9216f6f0cd6a1151f
+size 121296
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23100_11f9be8ea7412ec2178a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23100_11f9be8ea7412ec2178a.png
new file mode 100644
index 0000000000000000000000000000000000000000..79f38271e6fe6e0da190b56cabdc69832f62f821
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23100_11f9be8ea7412ec2178a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23100_34f43dd826c625ef997c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23100_34f43dd826c625ef997c.png
new file mode 100644
index 0000000000000000000000000000000000000000..f5d51cb8aa24bd7b8d466bdce601e62f4fd61e1f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23100_34f43dd826c625ef997c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23100_5cf1efb62719c98947a3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23100_5cf1efb62719c98947a3.png
new file mode 100644
index 0000000000000000000000000000000000000000..b68ae62997ba20911aa311fabc0ad8cba81cbd6f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23100_5cf1efb62719c98947a3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23300_0df164bbba0b192edf2e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23300_0df164bbba0b192edf2e.png
new file mode 100644
index 0000000000000000000000000000000000000000..9d3e1e47614b4399488b50105e512a2d82745038
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23300_0df164bbba0b192edf2e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23300_3b2bc877424589bf861d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23300_3b2bc877424589bf861d.png
new file mode 100644
index 0000000000000000000000000000000000000000..cf3204e6d1104c3737264cd1081972e5fc022e6f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23300_3b2bc877424589bf861d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23300_84bbfb8d87a4b8614b5b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23300_84bbfb8d87a4b8614b5b.png
new file mode 100644
index 0000000000000000000000000000000000000000..ab9802f7713fda18227a33beef9178b18d92b71e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23300_84bbfb8d87a4b8614b5b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:84bbfb8d87a4b8614b5b82492774e980818dbe6eaa05ce8f079f4b65c58e016d
+size 129108
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23300_fa8f87b734fda1cc010a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23300_fa8f87b734fda1cc010a.png
new file mode 100644
index 0000000000000000000000000000000000000000..9a07599953922fa096e657883adb19bc8dd981ac
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23300_fa8f87b734fda1cc010a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23500_23fa46a0ccee6c6baa39.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23500_23fa46a0ccee6c6baa39.png
new file mode 100644
index 0000000000000000000000000000000000000000..4dc698ad1fecdf6c3a83bf5edcf3799ed8574704
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23500_23fa46a0ccee6c6baa39.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23500_47efda8b790781b517db.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23500_47efda8b790781b517db.png
new file mode 100644
index 0000000000000000000000000000000000000000..e7f57af82ad43fdcbf15945b0c6f455195e4aa4a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23500_47efda8b790781b517db.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23500_787d371a902029fde3f7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23500_787d371a902029fde3f7.png
new file mode 100644
index 0000000000000000000000000000000000000000..a08a7a83574a2036525bd29bd113e911d292f286
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23500_787d371a902029fde3f7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23500_b3f79b73123806dd6708.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23500_b3f79b73123806dd6708.png
new file mode 100644
index 0000000000000000000000000000000000000000..6ae081883ea047692c47b548a646ff639abec35a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23500_b3f79b73123806dd6708.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23700_9e2832cc42c4709a3703.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23700_9e2832cc42c4709a3703.png
new file mode 100644
index 0000000000000000000000000000000000000000..1822f7be371352a29b9e0ebc0693d2332c0e664e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23700_9e2832cc42c4709a3703.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23700_c48057b539c5cc251be0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23700_c48057b539c5cc251be0.png
new file mode 100644
index 0000000000000000000000000000000000000000..3e762567e0bfb338c457eca3b143ab3c2509521e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23700_c48057b539c5cc251be0.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23700_c7e42a4314e194bf4848.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23700_c7e42a4314e194bf4848.png
new file mode 100644
index 0000000000000000000000000000000000000000..a82db4abf7d7489818288da41a6b2d069fb5cc59
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23700_c7e42a4314e194bf4848.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23700_d719fd0c9b972532a1a5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23700_d719fd0c9b972532a1a5.png
new file mode 100644
index 0000000000000000000000000000000000000000..bd2cdf1341afacc3223f90da2c4074370cad2251
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23700_d719fd0c9b972532a1a5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23900_45f90abac430e4656864.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23900_45f90abac430e4656864.png
new file mode 100644
index 0000000000000000000000000000000000000000..6a52f24a61e1a9a641479827327b2f689a6b4f7b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23900_45f90abac430e4656864.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23900_72e4988426cc9c10a12d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23900_72e4988426cc9c10a12d.png
new file mode 100644
index 0000000000000000000000000000000000000000..d338a19b4d347901f9465a0d72e011083973b574
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23900_72e4988426cc9c10a12d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23900_76d0ee64c4ee29934c22.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23900_76d0ee64c4ee29934c22.png
new file mode 100644
index 0000000000000000000000000000000000000000..01e84743acb86729a31f2a55034adb335b6fe42f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23900_76d0ee64c4ee29934c22.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23900_8795a0eba40c63211a7b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23900_8795a0eba40c63211a7b.png
new file mode 100644
index 0000000000000000000000000000000000000000..6e9af2108e562f78f09dd1fba9b19c232b4b9579
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_23900_8795a0eba40c63211a7b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24100_5654a3de1728acf9fb21.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24100_5654a3de1728acf9fb21.png
new file mode 100644
index 0000000000000000000000000000000000000000..15c8bd9aea3e9ce25a8c58e54554560b1bdc7d6a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24100_5654a3de1728acf9fb21.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24100_90eb8bfdcafd0705c7b3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24100_90eb8bfdcafd0705c7b3.png
new file mode 100644
index 0000000000000000000000000000000000000000..29c5421f3728a61727a1064ab4d2e93a3a79920d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24100_90eb8bfdcafd0705c7b3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24100_9380b1a8c6af6d579c6b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24100_9380b1a8c6af6d579c6b.png
new file mode 100644
index 0000000000000000000000000000000000000000..b5fcb208047351432cd3dce5401948dc41a7442c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24100_9380b1a8c6af6d579c6b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24100_e9947adba6d1988c284d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24100_e9947adba6d1988c284d.png
new file mode 100644
index 0000000000000000000000000000000000000000..e75d58c71c72d380a9bc55e7e81f5eef8ce67b15
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24100_e9947adba6d1988c284d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24300_03022c4b9620d5dd6ea2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24300_03022c4b9620d5dd6ea2.png
new file mode 100644
index 0000000000000000000000000000000000000000..a427b938671c0a70796ca72352a3e2bc149ac7d0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24300_03022c4b9620d5dd6ea2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24300_40a711853cc04dec8bee.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24300_40a711853cc04dec8bee.png
new file mode 100644
index 0000000000000000000000000000000000000000..bb1668ff17b84e29e0442285e34fa37c656f7c96
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24300_40a711853cc04dec8bee.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24300_4f6a857991f5ce5d8d2e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24300_4f6a857991f5ce5d8d2e.png
new file mode 100644
index 0000000000000000000000000000000000000000..bf7d1072dcf07ae18fbf1f05ec8195efaa9d8a1e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24300_4f6a857991f5ce5d8d2e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24300_d7e47244594389df16f4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24300_d7e47244594389df16f4.png
new file mode 100644
index 0000000000000000000000000000000000000000..8cb1838fdd3a68a3d366da72d0dee8424f71beb1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24300_d7e47244594389df16f4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24500_2641a6c7f099f8c4594a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24500_2641a6c7f099f8c4594a.png
new file mode 100644
index 0000000000000000000000000000000000000000..0fe3125929fff403a999aac26487aad8fa955abf
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24500_2641a6c7f099f8c4594a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24500_5e5b6c03779e2abbe693.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24500_5e5b6c03779e2abbe693.png
new file mode 100644
index 0000000000000000000000000000000000000000..a791f665cb4d9476bd01e574f29789e1875d1d0c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24500_5e5b6c03779e2abbe693.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24500_b3fb6d7d273c8673955c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24500_b3fb6d7d273c8673955c.png
new file mode 100644
index 0000000000000000000000000000000000000000..c89fea6c1f3bc62ea0adf1c56aef6b417144c523
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24500_b3fb6d7d273c8673955c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b3fb6d7d273c8673955c85a77ac35bccf700387789edfe7a3b990668603088f1
+size 109303
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24500_e3660b2f8ce01915c958.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24500_e3660b2f8ce01915c958.png
new file mode 100644
index 0000000000000000000000000000000000000000..82d1c4d808832ff2996df6c0febd781af2768029
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24500_e3660b2f8ce01915c958.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24700_2a4818824e625b0dad31.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24700_2a4818824e625b0dad31.png
new file mode 100644
index 0000000000000000000000000000000000000000..812624d2c480c5b1a094dadc353ef5bb6279c69d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24700_2a4818824e625b0dad31.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24700_733a3378958d70c356f3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24700_733a3378958d70c356f3.png
new file mode 100644
index 0000000000000000000000000000000000000000..48a632c483deaa6446ace093db43fc06155493d3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24700_733a3378958d70c356f3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24700_79256a6971445e65cb2d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24700_79256a6971445e65cb2d.png
new file mode 100644
index 0000000000000000000000000000000000000000..a5bb665572423cba124e9be70867bb7fcb8e4c31
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24700_79256a6971445e65cb2d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24700_c5b9b312f35046221492.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24700_c5b9b312f35046221492.png
new file mode 100644
index 0000000000000000000000000000000000000000..7867ed62172bcccc11bfcca7086fe147015ad176
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24700_c5b9b312f35046221492.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24900_08f11e55770ab6d2bb33.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24900_08f11e55770ab6d2bb33.png
new file mode 100644
index 0000000000000000000000000000000000000000..4a4cb02e7b9af8acfc358616c5e652df5ebd2879
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24900_08f11e55770ab6d2bb33.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24900_1b0b5420c8347a288392.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24900_1b0b5420c8347a288392.png
new file mode 100644
index 0000000000000000000000000000000000000000..9a6f830a768a5a542a6b8f64680509c6e715774c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24900_1b0b5420c8347a288392.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24900_28a29aabbbe9b93e44e4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24900_28a29aabbbe9b93e44e4.png
new file mode 100644
index 0000000000000000000000000000000000000000..9e92889eef84870700808fdb83b9a125962877e3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24900_28a29aabbbe9b93e44e4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24900_cf5d97e7dece21341c9d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24900_cf5d97e7dece21341c9d.png
new file mode 100644
index 0000000000000000000000000000000000000000..5764110fd053673f75dd39c2e564fa92447f1804
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_24900_cf5d97e7dece21341c9d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2500_822d7c3a3f471d8670f9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2500_822d7c3a3f471d8670f9.png
new file mode 100644
index 0000000000000000000000000000000000000000..959b2c07eb69b330516f05d0fa857eb716d6457f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2500_822d7c3a3f471d8670f9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2500_9c35db96d6e44a25bfa5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2500_9c35db96d6e44a25bfa5.png
new file mode 100644
index 0000000000000000000000000000000000000000..30814b39175acb394c2c58d197829778878010f3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2500_9c35db96d6e44a25bfa5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2500_a10a762c8d553fcd4008.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2500_a10a762c8d553fcd4008.png
new file mode 100644
index 0000000000000000000000000000000000000000..da1772affdd1f3d59a40a587df85581f2f754588
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2500_a10a762c8d553fcd4008.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2500_ccc1c3e0c017e6c63fde.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2500_ccc1c3e0c017e6c63fde.png
new file mode 100644
index 0000000000000000000000000000000000000000..281244d66fc700bdd38e1d518ad5c825da942d62
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2500_ccc1c3e0c017e6c63fde.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25100_27ddffb029fc1f1fec99.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25100_27ddffb029fc1f1fec99.png
new file mode 100644
index 0000000000000000000000000000000000000000..3c6eb66ef229087b0c5d20122032acdd8d4613ee
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25100_27ddffb029fc1f1fec99.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25100_9cef3feb0ffa78af53d4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25100_9cef3feb0ffa78af53d4.png
new file mode 100644
index 0000000000000000000000000000000000000000..73a452df9ab656dc65a6450c6283d446194c7520
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25100_9cef3feb0ffa78af53d4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25100_de3889b50b823acb9250.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25100_de3889b50b823acb9250.png
new file mode 100644
index 0000000000000000000000000000000000000000..5258f148e45e93e42cb1244ee68e8b4ab6409de0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25100_de3889b50b823acb9250.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25100_f82902b0385f9efb58e3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25100_f82902b0385f9efb58e3.png
new file mode 100644
index 0000000000000000000000000000000000000000..306ddcb9fdca7df4a43bbfcc568fe797b0f2cf4f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25100_f82902b0385f9efb58e3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25300_0d4751a5762c5b15a983.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25300_0d4751a5762c5b15a983.png
new file mode 100644
index 0000000000000000000000000000000000000000..160368d8b275f2ef5e44417d05727d29faec1e2d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25300_0d4751a5762c5b15a983.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25300_7ff1c5cf91f0356da872.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25300_7ff1c5cf91f0356da872.png
new file mode 100644
index 0000000000000000000000000000000000000000..e5cf2dc898a26903e616605e52531063a735e26c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25300_7ff1c5cf91f0356da872.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25300_a9e017e883736bda35ef.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25300_a9e017e883736bda35ef.png
new file mode 100644
index 0000000000000000000000000000000000000000..d046dbf0040bfa6a3da08ab83c48e0d1856ac436
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25300_a9e017e883736bda35ef.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25300_f8ca73b56880df5bed30.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25300_f8ca73b56880df5bed30.png
new file mode 100644
index 0000000000000000000000000000000000000000..23fe90471c926d43b6acc734ad506e180ac3daab
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25300_f8ca73b56880df5bed30.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25500_8a382a3acd1e1502c5d1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25500_8a382a3acd1e1502c5d1.png
new file mode 100644
index 0000000000000000000000000000000000000000..79d039829587028bc76352a2f2a62085dcf25604
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25500_8a382a3acd1e1502c5d1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25500_c48d35cac6f9f11a1249.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25500_c48d35cac6f9f11a1249.png
new file mode 100644
index 0000000000000000000000000000000000000000..949dd2a26015025c518fbe5958ad4248ff45075d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25500_c48d35cac6f9f11a1249.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25500_d5a11bedb92d09dd3d29.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25500_d5a11bedb92d09dd3d29.png
new file mode 100644
index 0000000000000000000000000000000000000000..3b4429461dee6b5b8f60a132f4bc4bae202f4893
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25500_d5a11bedb92d09dd3d29.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25500_da8faa22acb19686bce9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25500_da8faa22acb19686bce9.png
new file mode 100644
index 0000000000000000000000000000000000000000..b429bcbb4b0dd9085782fb1dc60dd4c4eb1fa448
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25500_da8faa22acb19686bce9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:da8faa22acb19686bce92229d509653356f94db6256fe3a3f703de2da4f2d2a6
+size 143054
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25700_04ae70744067fc0f7a71.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25700_04ae70744067fc0f7a71.png
new file mode 100644
index 0000000000000000000000000000000000000000..8fe15c47a61768980bf97dd67a5ea868514a12e7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25700_04ae70744067fc0f7a71.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25700_264dba17c3af6fa37987.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25700_264dba17c3af6fa37987.png
new file mode 100644
index 0000000000000000000000000000000000000000..575b5574d6b395f7b4654785acb929db40c88353
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25700_264dba17c3af6fa37987.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25700_4e02faa7484219d01730.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25700_4e02faa7484219d01730.png
new file mode 100644
index 0000000000000000000000000000000000000000..fbb09ecbebb18014b18af653eae6d796a91537de
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25700_4e02faa7484219d01730.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25700_d7751e656c48fae969b3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25700_d7751e656c48fae969b3.png
new file mode 100644
index 0000000000000000000000000000000000000000..e1af30c38deb79321173deba71f399ecadd1319a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25700_d7751e656c48fae969b3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25900_020c933113b887c8944b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25900_020c933113b887c8944b.png
new file mode 100644
index 0000000000000000000000000000000000000000..3975e3c22cf1640ee5eb658221fe5e9731db0b40
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25900_020c933113b887c8944b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25900_3af4a8aecd330094fc0c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25900_3af4a8aecd330094fc0c.png
new file mode 100644
index 0000000000000000000000000000000000000000..eb3510d6e9b3b7991a20427bcbc273cd38b24d92
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25900_3af4a8aecd330094fc0c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25900_d728b35811e4cbc304b2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25900_d728b35811e4cbc304b2.png
new file mode 100644
index 0000000000000000000000000000000000000000..e42b0d309ac45ce5ef393c60d5a76c1f3a267cd5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25900_d728b35811e4cbc304b2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25900_fb66da5ff85384515dbd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25900_fb66da5ff85384515dbd.png
new file mode 100644
index 0000000000000000000000000000000000000000..3cef3a7001203c96b9929dfb2d32770c20654e30
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_25900_fb66da5ff85384515dbd.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26100_3fe45c01529c78c8d504.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26100_3fe45c01529c78c8d504.png
new file mode 100644
index 0000000000000000000000000000000000000000..5fca5e5f3f7585dc61c6ab7777dc45946320c1fe
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26100_3fe45c01529c78c8d504.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3fe45c01529c78c8d504692205af1fb5835e0cd2b9914b4906c81181d9bb576b
+size 110954
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26100_bbae61af96b574bbdefa.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26100_bbae61af96b574bbdefa.png
new file mode 100644
index 0000000000000000000000000000000000000000..b585d9206c301ea5931333b0356b70dc2226ea6b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26100_bbae61af96b574bbdefa.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26100_fa665053de22681c71f6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26100_fa665053de22681c71f6.png
new file mode 100644
index 0000000000000000000000000000000000000000..d3c3cf06c69cad27a41ebc5df4fe67678592e587
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26100_fa665053de22681c71f6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26100_fc3288b22b226bdcb6ce.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26100_fc3288b22b226bdcb6ce.png
new file mode 100644
index 0000000000000000000000000000000000000000..64b104c98405b7cf65790ba467d2da4e3c5ec36a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26100_fc3288b22b226bdcb6ce.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26300_4d2c7d90f893c3301510.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26300_4d2c7d90f893c3301510.png
new file mode 100644
index 0000000000000000000000000000000000000000..2cdb926268f8cc0275fb97dedd2b94ae6a95540e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26300_4d2c7d90f893c3301510.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26300_6cb10d040c4e9749b917.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26300_6cb10d040c4e9749b917.png
new file mode 100644
index 0000000000000000000000000000000000000000..127d3ffa9a865feb46efa45e9df994bf5abfef75
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26300_6cb10d040c4e9749b917.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26300_7c86be018a56fb14c3b1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26300_7c86be018a56fb14c3b1.png
new file mode 100644
index 0000000000000000000000000000000000000000..1993af270a8ade05af6659256c9b381de6c1ded4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26300_7c86be018a56fb14c3b1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26300_ce80edf0c1f79b132f5e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26300_ce80edf0c1f79b132f5e.png
new file mode 100644
index 0000000000000000000000000000000000000000..8ab7f6af48b0fcd0879adae2b51143171ba5efbe
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26300_ce80edf0c1f79b132f5e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26500_0dae4e0e201b98cf71ed.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26500_0dae4e0e201b98cf71ed.png
new file mode 100644
index 0000000000000000000000000000000000000000..d363a470665b059710499a443e1892b2a0a9b8fb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26500_0dae4e0e201b98cf71ed.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26500_36c8233928c229cf2f1e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26500_36c8233928c229cf2f1e.png
new file mode 100644
index 0000000000000000000000000000000000000000..dbd99db9dc69125456cf9b269d17d0e70f6f6fc6
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26500_36c8233928c229cf2f1e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26500_48b4e59abca0e118db89.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26500_48b4e59abca0e118db89.png
new file mode 100644
index 0000000000000000000000000000000000000000..a30a9b57f18c4207c0807ce11ddd73f7f20ab15c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26500_48b4e59abca0e118db89.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26500_c44e257813a35f28bc4b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26500_c44e257813a35f28bc4b.png
new file mode 100644
index 0000000000000000000000000000000000000000..8d67b339baddf4c0cc136fc65d6f229e5533b2bd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26500_c44e257813a35f28bc4b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26700_35c7c682f9d63f91b940.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26700_35c7c682f9d63f91b940.png
new file mode 100644
index 0000000000000000000000000000000000000000..b145e633e64d3cec06ec6b3c201e40ceaddad0bb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26700_35c7c682f9d63f91b940.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26700_510b950ac1f443e595c1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26700_510b950ac1f443e595c1.png
new file mode 100644
index 0000000000000000000000000000000000000000..4de9dd80bdea1fbc242e797dabb25fdae035fc05
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26700_510b950ac1f443e595c1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26700_82d2aad4d232755c97bd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26700_82d2aad4d232755c97bd.png
new file mode 100644
index 0000000000000000000000000000000000000000..fadf11a81df72a9f04eb45e8bc681d5c2793bff3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26700_82d2aad4d232755c97bd.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26700_c5325b67c5416a28b5a3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26700_c5325b67c5416a28b5a3.png
new file mode 100644
index 0000000000000000000000000000000000000000..0505727cb609f77183a2a78af87552de922a958a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26700_c5325b67c5416a28b5a3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26900_01d8ec690e569594bc2f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26900_01d8ec690e569594bc2f.png
new file mode 100644
index 0000000000000000000000000000000000000000..226f4a1cc6a6c896d86c91b2aea93532a9ea327a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26900_01d8ec690e569594bc2f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26900_450e89d4356939450660.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26900_450e89d4356939450660.png
new file mode 100644
index 0000000000000000000000000000000000000000..4cfd1d2211b08fae122d1a0e29631108546885ca
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26900_450e89d4356939450660.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26900_e69e9fbbdc937c0cb726.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26900_e69e9fbbdc937c0cb726.png
new file mode 100644
index 0000000000000000000000000000000000000000..7c510353e0a5dad6c31dc99de9bf415056103e93
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26900_e69e9fbbdc937c0cb726.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26900_f9f8fc85297c0a17056c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26900_f9f8fc85297c0a17056c.png
new file mode 100644
index 0000000000000000000000000000000000000000..5be60cce89a8e62dca8d3058dd6f78e264b44016
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_26900_f9f8fc85297c0a17056c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2700_353160c4965a0e8d39ad.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2700_353160c4965a0e8d39ad.png
new file mode 100644
index 0000000000000000000000000000000000000000..035dcd148b3a5048335785a2db7b238ceeae285c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2700_353160c4965a0e8d39ad.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2700_a977f567e411bd553fc8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2700_a977f567e411bd553fc8.png
new file mode 100644
index 0000000000000000000000000000000000000000..1a63bc3bac884226f279204d73330be4aefa5a70
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2700_a977f567e411bd553fc8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2700_ba55d5593af6fed3beb0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2700_ba55d5593af6fed3beb0.png
new file mode 100644
index 0000000000000000000000000000000000000000..6df431b67d14e102ab62110deb08f5905b4119df
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2700_ba55d5593af6fed3beb0.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2700_cd08de5ee9ba214b9952.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2700_cd08de5ee9ba214b9952.png
new file mode 100644
index 0000000000000000000000000000000000000000..cd4bcdd9a01918d5aaf6d7a4f3aac8b98de99fe5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2700_cd08de5ee9ba214b9952.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27100_2d0f1e2315284004c725.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27100_2d0f1e2315284004c725.png
new file mode 100644
index 0000000000000000000000000000000000000000..f1d22e72717dd709bf3ddd2026c1762e167fd0c4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27100_2d0f1e2315284004c725.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27100_b4cc0c2bfd1277f93836.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27100_b4cc0c2bfd1277f93836.png
new file mode 100644
index 0000000000000000000000000000000000000000..2feaeea9b91e09e8fcc3d0360171662c29fa2074
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27100_b4cc0c2bfd1277f93836.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27100_dfc6d41204f0f7857153.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27100_dfc6d41204f0f7857153.png
new file mode 100644
index 0000000000000000000000000000000000000000..c25f6e15dbcae7f450d9efccef284356e83c9b12
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27100_dfc6d41204f0f7857153.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27100_efa3f0f6885d9948b522.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27100_efa3f0f6885d9948b522.png
new file mode 100644
index 0000000000000000000000000000000000000000..75443ab5397a96b68fcc58aea0e0305e854bbafb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27100_efa3f0f6885d9948b522.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27300_4388644b996781ac7ec2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27300_4388644b996781ac7ec2.png
new file mode 100644
index 0000000000000000000000000000000000000000..f07663a153e1ae32ed92375de91992e0cb6af85a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27300_4388644b996781ac7ec2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27300_55ec94e19103a9fea400.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27300_55ec94e19103a9fea400.png
new file mode 100644
index 0000000000000000000000000000000000000000..e44a96873b13c8c82089543ab4bd667d86970366
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27300_55ec94e19103a9fea400.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27300_aad4379dbccf5de8a481.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27300_aad4379dbccf5de8a481.png
new file mode 100644
index 0000000000000000000000000000000000000000..59ada5cb522c500fbe156b93a63dc79c669b2c71
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27300_aad4379dbccf5de8a481.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27300_ef3befa0d5a54c10e930.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27300_ef3befa0d5a54c10e930.png
new file mode 100644
index 0000000000000000000000000000000000000000..159356b68b5e24cb707665bf20b5b80d43caa07e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27300_ef3befa0d5a54c10e930.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27500_3bcaf5af71bce17c52b7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27500_3bcaf5af71bce17c52b7.png
new file mode 100644
index 0000000000000000000000000000000000000000..803308b3fb55b6ee34e0cf893fd3d9ad87d28e62
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27500_3bcaf5af71bce17c52b7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27500_79e0706b94727d77f91b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27500_79e0706b94727d77f91b.png
new file mode 100644
index 0000000000000000000000000000000000000000..6dae1073fc756fb7e91b26803f8f1bd55c267ba0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27500_79e0706b94727d77f91b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27500_a30680a4348d36d34318.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27500_a30680a4348d36d34318.png
new file mode 100644
index 0000000000000000000000000000000000000000..e24883610293e2bb0737407871f6efc6ee831cf5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27500_a30680a4348d36d34318.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27500_b68285ce13151dfdcc84.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27500_b68285ce13151dfdcc84.png
new file mode 100644
index 0000000000000000000000000000000000000000..e09ff8b7a337169725898ce5ab687f92fca89a7d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27500_b68285ce13151dfdcc84.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27700_62d16d9b5ea349f279a9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27700_62d16d9b5ea349f279a9.png
new file mode 100644
index 0000000000000000000000000000000000000000..3ca1644dc0b014a981e82d39e117ab65ea29a331
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27700_62d16d9b5ea349f279a9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27700_aa902397720cb7166106.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27700_aa902397720cb7166106.png
new file mode 100644
index 0000000000000000000000000000000000000000..74b0ff7a2f11d2da7e258b9b20580b8a20eac166
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27700_aa902397720cb7166106.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27700_d05db8eac4d19904a8d7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27700_d05db8eac4d19904a8d7.png
new file mode 100644
index 0000000000000000000000000000000000000000..f6ffacc11e38e8250dd55c7464314073085e99ff
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27700_d05db8eac4d19904a8d7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d05db8eac4d19904a8d77da3d67aab10a4d7b378890c52edb86fa24502be7c5f
+size 104398
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27700_eeeb88ff332d80dbfb52.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27700_eeeb88ff332d80dbfb52.png
new file mode 100644
index 0000000000000000000000000000000000000000..78ef0ed34839f1414f6aaabc9c91286a93af1928
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27700_eeeb88ff332d80dbfb52.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27900_25f5202a3970620de593.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27900_25f5202a3970620de593.png
new file mode 100644
index 0000000000000000000000000000000000000000..44ab3d6b5e2c1e1ff3798f9dce05a70e0f98c4bc
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27900_25f5202a3970620de593.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27900_df3ce7e9f174333e23ab.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27900_df3ce7e9f174333e23ab.png
new file mode 100644
index 0000000000000000000000000000000000000000..42d395a13fa20a68787ef159a7dd9930ef39d490
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27900_df3ce7e9f174333e23ab.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27900_e590e8ded226e324dc3d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27900_e590e8ded226e324dc3d.png
new file mode 100644
index 0000000000000000000000000000000000000000..547d8126d8449090ef7082bc67938a3ca3fd4523
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27900_e590e8ded226e324dc3d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27900_e67f705e507e24a63c67.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27900_e67f705e507e24a63c67.png
new file mode 100644
index 0000000000000000000000000000000000000000..f451144755ce357be2d14b3356a0135b4ea13526
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_27900_e67f705e507e24a63c67.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28100_185ae4bd98f0e9cefa05.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28100_185ae4bd98f0e9cefa05.png
new file mode 100644
index 0000000000000000000000000000000000000000..6bfe28150cd15b3675f78ae0a80a0354215c45f7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28100_185ae4bd98f0e9cefa05.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28100_58b847c182d158acc1f4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28100_58b847c182d158acc1f4.png
new file mode 100644
index 0000000000000000000000000000000000000000..ea16c4278d99d373fe5209181abdf046ea1e2b20
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28100_58b847c182d158acc1f4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28100_cf7323b8f923f12ff4d1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28100_cf7323b8f923f12ff4d1.png
new file mode 100644
index 0000000000000000000000000000000000000000..3e74c0c4fefd6e813e332efc3a41fd3daf13b99a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28100_cf7323b8f923f12ff4d1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28100_ef5de000203efe98afa9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28100_ef5de000203efe98afa9.png
new file mode 100644
index 0000000000000000000000000000000000000000..c4070ea55b5316a435eb1fe0df18cb93ee10778e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28100_ef5de000203efe98afa9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28300_0df869fbfcd2a6292a19.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28300_0df869fbfcd2a6292a19.png
new file mode 100644
index 0000000000000000000000000000000000000000..3e0fd961f74f4d41fbf8735617c89e50b468377c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28300_0df869fbfcd2a6292a19.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28300_24ecbdf2e8b9a914df0a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28300_24ecbdf2e8b9a914df0a.png
new file mode 100644
index 0000000000000000000000000000000000000000..62c7ad75e53055b07dd7c0898c6d6a20d810dc9c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28300_24ecbdf2e8b9a914df0a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28300_bdce1de5653065141a7c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28300_bdce1de5653065141a7c.png
new file mode 100644
index 0000000000000000000000000000000000000000..f172ad92e049c7515a015a4ea0e61a85a60f7cdc
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28300_bdce1de5653065141a7c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28300_fd9321df7280f001ab9b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28300_fd9321df7280f001ab9b.png
new file mode 100644
index 0000000000000000000000000000000000000000..4dfe2663741be81f7e6f4c17f403acef9bd84b67
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28300_fd9321df7280f001ab9b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28500_24742b81a0a84c89f145.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28500_24742b81a0a84c89f145.png
new file mode 100644
index 0000000000000000000000000000000000000000..f9fea3b4ce93c61e67537661a6b871f21ebf7be8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28500_24742b81a0a84c89f145.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28500_a1e1c751d906955c86ea.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28500_a1e1c751d906955c86ea.png
new file mode 100644
index 0000000000000000000000000000000000000000..89884cbc780bea4001241b6b0a1d990a81dda545
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28500_a1e1c751d906955c86ea.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a1e1c751d906955c86eab97f55b4221ef2edcee282b87a2a7bfe4519da0178ff
+size 110047
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28500_be2f527aceabf9751a15.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28500_be2f527aceabf9751a15.png
new file mode 100644
index 0000000000000000000000000000000000000000..83e19190db58d793410f860855098fa9c9f667ff
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28500_be2f527aceabf9751a15.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28500_fd6f79d8cf3fe8a5b387.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28500_fd6f79d8cf3fe8a5b387.png
new file mode 100644
index 0000000000000000000000000000000000000000..dee0e30f984c499724feb48369160cad86b7af4a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28500_fd6f79d8cf3fe8a5b387.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28700_675699760cf867916c86.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28700_675699760cf867916c86.png
new file mode 100644
index 0000000000000000000000000000000000000000..90657cd4e454acb230fb0f5a44b7011a6906cee7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28700_675699760cf867916c86.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:675699760cf867916c8645ed54384c11891615419f23ff6cfe2d972808b0da64
+size 137237
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28700_9a19bc527fb2465190ba.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28700_9a19bc527fb2465190ba.png
new file mode 100644
index 0000000000000000000000000000000000000000..4ca6659a5507c309d314b4e3bbe7d348efddd791
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28700_9a19bc527fb2465190ba.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28700_e22b43d9a099503c175f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28700_e22b43d9a099503c175f.png
new file mode 100644
index 0000000000000000000000000000000000000000..f4dfb122ac49b9d892889b3caa6e05d59689b955
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28700_e22b43d9a099503c175f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28700_e726bbe8293024e6a6ec.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28700_e726bbe8293024e6a6ec.png
new file mode 100644
index 0000000000000000000000000000000000000000..5c4f91af28b3659b2db82435d64d6d636a0e7433
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28700_e726bbe8293024e6a6ec.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28900_5b7d21713774fad5cb5b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28900_5b7d21713774fad5cb5b.png
new file mode 100644
index 0000000000000000000000000000000000000000..3d0192cfda477177427e4f6ec11957b7a641a176
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28900_5b7d21713774fad5cb5b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28900_b06e099f7f849edc857e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28900_b06e099f7f849edc857e.png
new file mode 100644
index 0000000000000000000000000000000000000000..a90e26902a0667ba38327a2e517896c176b175fe
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28900_b06e099f7f849edc857e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28900_baa1d4294af5a0ddefa8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28900_baa1d4294af5a0ddefa8.png
new file mode 100644
index 0000000000000000000000000000000000000000..fcd2f4fd56759baa4d2d1667a0f30dce7b16595b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28900_baa1d4294af5a0ddefa8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28900_d6f495ef00be065faf41.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28900_d6f495ef00be065faf41.png
new file mode 100644
index 0000000000000000000000000000000000000000..f39050e722049b93ed7c3e167977a31b6f344b8b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_28900_d6f495ef00be065faf41.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d6f495ef00be065faf41a24035179b3e6880417b7d3e25a4cb7badc4f405ae12
+size 106267
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2900_40bae7d44e50241b1e93.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2900_40bae7d44e50241b1e93.png
new file mode 100644
index 0000000000000000000000000000000000000000..ae9b5cdf0e7e62ac04988918ddbc558f3cd909f2
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2900_40bae7d44e50241b1e93.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2900_71a667cbd70232467c89.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2900_71a667cbd70232467c89.png
new file mode 100644
index 0000000000000000000000000000000000000000..81aebcb86c9d044c1ee863d360cf5ce0d470155d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2900_71a667cbd70232467c89.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2900_9098c0ee48aab8ccd6f7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2900_9098c0ee48aab8ccd6f7.png
new file mode 100644
index 0000000000000000000000000000000000000000..7f69852c67114799bab2b084a5dc243d94845dda
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2900_9098c0ee48aab8ccd6f7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2900_d85d7e1f3906b6990526.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2900_d85d7e1f3906b6990526.png
new file mode 100644
index 0000000000000000000000000000000000000000..4c56fb48444a1beb5a9d7577ee35e85f0c606870
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_2900_d85d7e1f3906b6990526.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29100_5b5413b58d26f633709a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29100_5b5413b58d26f633709a.png
new file mode 100644
index 0000000000000000000000000000000000000000..63b21c50dc32902778e9bb82ed2bd6d17aca9d5a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29100_5b5413b58d26f633709a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29100_5e0a4e7e8e6e9f345cae.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29100_5e0a4e7e8e6e9f345cae.png
new file mode 100644
index 0000000000000000000000000000000000000000..fe69633010cd0c95ce0b08fcfa33847ef3bfd8eb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29100_5e0a4e7e8e6e9f345cae.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29100_94362947e529e2b1d58f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29100_94362947e529e2b1d58f.png
new file mode 100644
index 0000000000000000000000000000000000000000..24a8d3125223c3738998e3f9c36dea436a1e8654
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29100_94362947e529e2b1d58f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29100_bddfc6c5c27d88de84f8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29100_bddfc6c5c27d88de84f8.png
new file mode 100644
index 0000000000000000000000000000000000000000..ede40f1ee5b13e6c3a8686f9dbd4234486c2b022
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29100_bddfc6c5c27d88de84f8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29300_1b5d3f827b3e19d6f65e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29300_1b5d3f827b3e19d6f65e.png
new file mode 100644
index 0000000000000000000000000000000000000000..d4580743a9653e84c7097929943785c4477ad7eb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29300_1b5d3f827b3e19d6f65e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29300_1cbd40f9833b18f61bc8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29300_1cbd40f9833b18f61bc8.png
new file mode 100644
index 0000000000000000000000000000000000000000..c37782dd9a126e0eec9e3a9c9398668b761dc87e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29300_1cbd40f9833b18f61bc8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29300_32deac1f1e7f19f60c35.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29300_32deac1f1e7f19f60c35.png
new file mode 100644
index 0000000000000000000000000000000000000000..cbaa6bba883bd93cbfb0a25b0e92886aa3596457
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29300_32deac1f1e7f19f60c35.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29300_4f2eb021e7fd80c34907.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29300_4f2eb021e7fd80c34907.png
new file mode 100644
index 0000000000000000000000000000000000000000..26d56bff67734c749accd0d43a13d36f7dd71b4a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29300_4f2eb021e7fd80c34907.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29500_3e0fffc5fec3271d19a4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29500_3e0fffc5fec3271d19a4.png
new file mode 100644
index 0000000000000000000000000000000000000000..16d5e5d098d4aa175d3aa82eef6d8669b49729f0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29500_3e0fffc5fec3271d19a4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29500_5e095a706396dda390cf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29500_5e095a706396dda390cf.png
new file mode 100644
index 0000000000000000000000000000000000000000..89d1032aa4932f0f5e872b842f5821880ed14f63
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29500_5e095a706396dda390cf.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29500_d83dd9f5418bf86035bb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29500_d83dd9f5418bf86035bb.png
new file mode 100644
index 0000000000000000000000000000000000000000..2ce2aa62565326d2f61f506898db07eb39ab88b1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29500_d83dd9f5418bf86035bb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d83dd9f5418bf86035bbfcb44dd98b4d3321e8c5e61c846ba31693a833b69bfd
+size 140471
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29500_f0db93b2ab5970f13a5e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29500_f0db93b2ab5970f13a5e.png
new file mode 100644
index 0000000000000000000000000000000000000000..f2df32f50a85a19f37f5895975aa51eea8a03fd2
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29500_f0db93b2ab5970f13a5e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29700_1213f9e1732469203d0c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29700_1213f9e1732469203d0c.png
new file mode 100644
index 0000000000000000000000000000000000000000..c5aa52eb8f9684200d72195107e0a976d00d1bca
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29700_1213f9e1732469203d0c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29700_48603f09876fc4b5ec5f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29700_48603f09876fc4b5ec5f.png
new file mode 100644
index 0000000000000000000000000000000000000000..2f857d6f817f644faa04d0c6739ab24c0d22b41e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29700_48603f09876fc4b5ec5f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:48603f09876fc4b5ec5fb7d8666ae32c53cbfcc8c1a33e9cc51e177157311301
+size 104757
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29700_778926ec198fb87a8b4f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29700_778926ec198fb87a8b4f.png
new file mode 100644
index 0000000000000000000000000000000000000000..6dca0e23a971089f09fb1a73d1370e9026b7eb44
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29700_778926ec198fb87a8b4f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29700_f402e325575639b641fc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29700_f402e325575639b641fc.png
new file mode 100644
index 0000000000000000000000000000000000000000..fa87a0bd18a09b7b51ccd35aa96134cf416756d0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29700_f402e325575639b641fc.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29900_20f6b49fb5efb453ac70.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29900_20f6b49fb5efb453ac70.png
new file mode 100644
index 0000000000000000000000000000000000000000..8b6cdd88ca856b9c3799bee3ae73c6a6f3b466e9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29900_20f6b49fb5efb453ac70.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29900_3ffdc8a05757b312c449.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29900_3ffdc8a05757b312c449.png
new file mode 100644
index 0000000000000000000000000000000000000000..a2f936720c4f532b46ada4a80a3a1d792fcb731d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29900_3ffdc8a05757b312c449.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29900_beaf4095196f89d8870f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29900_beaf4095196f89d8870f.png
new file mode 100644
index 0000000000000000000000000000000000000000..b7f714749626bc9b1d55fb7b5d0c9d32b0f2bdba
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29900_beaf4095196f89d8870f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29900_d5309211bb3806de0084.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29900_d5309211bb3806de0084.png
new file mode 100644
index 0000000000000000000000000000000000000000..e673e5bf8f12e39e37bcdbff9613774900600373
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_29900_d5309211bb3806de0084.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_300_42de2d643452884a1c32.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_300_42de2d643452884a1c32.png
new file mode 100644
index 0000000000000000000000000000000000000000..e573cc528d8ce337c68afc252f8269eeef97fb55
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_300_42de2d643452884a1c32.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_300_7feeb97ff98987588b2b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_300_7feeb97ff98987588b2b.png
new file mode 100644
index 0000000000000000000000000000000000000000..ec01e5121309359c89e315d4dc34ce37534b747d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_300_7feeb97ff98987588b2b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_300_a113a326a28660c8642a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_300_a113a326a28660c8642a.png
new file mode 100644
index 0000000000000000000000000000000000000000..a10fd4132f44e5e217e734df90c6a82c34fdec9b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_300_a113a326a28660c8642a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_300_a87e436767cced8489ae.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_300_a87e436767cced8489ae.png
new file mode 100644
index 0000000000000000000000000000000000000000..94b2e65074336d252db8b136c1436aa510ce77ef
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_300_a87e436767cced8489ae.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30100_092bf2b1b51d02355eb1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30100_092bf2b1b51d02355eb1.png
new file mode 100644
index 0000000000000000000000000000000000000000..35cffe2800cf05489358af7ae45b1288ed64a654
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30100_092bf2b1b51d02355eb1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30100_550042693a4b916586ca.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30100_550042693a4b916586ca.png
new file mode 100644
index 0000000000000000000000000000000000000000..befd34630d0e419eea8b12947f2ee735cf0d3120
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30100_550042693a4b916586ca.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30100_9bf98067fcec1dc0c41b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30100_9bf98067fcec1dc0c41b.png
new file mode 100644
index 0000000000000000000000000000000000000000..91be73f8eed5141525e868a3de7666a1c021324c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30100_9bf98067fcec1dc0c41b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30100_dcba2929f0a853c47c9c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30100_dcba2929f0a853c47c9c.png
new file mode 100644
index 0000000000000000000000000000000000000000..3624db96c78d3780d1d90dc9d43577b2396a731f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30100_dcba2929f0a853c47c9c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30300_272cb459235f1c7d2f2c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30300_272cb459235f1c7d2f2c.png
new file mode 100644
index 0000000000000000000000000000000000000000..7f2b20405b51978b850a86c779d1096e5c2adb78
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30300_272cb459235f1c7d2f2c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30300_6bebe64308c092bf5b3e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30300_6bebe64308c092bf5b3e.png
new file mode 100644
index 0000000000000000000000000000000000000000..e5769a0aa89528d211daa406c3d32baf632df53a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30300_6bebe64308c092bf5b3e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30300_87fae132cdfd7faa4bbb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30300_87fae132cdfd7faa4bbb.png
new file mode 100644
index 0000000000000000000000000000000000000000..4dd8b31807bf48aaee75fb54b08f924b1d49c38d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30300_87fae132cdfd7faa4bbb.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30300_cbdde5f26ae91e9437d9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30300_cbdde5f26ae91e9437d9.png
new file mode 100644
index 0000000000000000000000000000000000000000..50c98b83bc91f5013b2d26defba42afd69cdfe8e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30300_cbdde5f26ae91e9437d9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30500_77504dbd120c61528134.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30500_77504dbd120c61528134.png
new file mode 100644
index 0000000000000000000000000000000000000000..da66e6a096ffa420bad66cdc46bce8eaabb2cb1c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30500_77504dbd120c61528134.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30500_9e1af6d7ddd8e5483b73.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30500_9e1af6d7ddd8e5483b73.png
new file mode 100644
index 0000000000000000000000000000000000000000..731a8501fbc6d93790cd2ff2305f100ad1080ee1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30500_9e1af6d7ddd8e5483b73.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30500_b16a24322a818d4cb191.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30500_b16a24322a818d4cb191.png
new file mode 100644
index 0000000000000000000000000000000000000000..844db25ce6fa6871d4cf33e3273b04b089e927f4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30500_b16a24322a818d4cb191.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30500_c965d4e56673c52ffa43.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30500_c965d4e56673c52ffa43.png
new file mode 100644
index 0000000000000000000000000000000000000000..01cadf262539e6dd25189f7ff0d536a2412ebd21
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30500_c965d4e56673c52ffa43.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30700_20111d6171f94c5c3c5a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30700_20111d6171f94c5c3c5a.png
new file mode 100644
index 0000000000000000000000000000000000000000..97607555659a7a6496b3220375a2f26af04a98cf
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30700_20111d6171f94c5c3c5a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30700_4f3000a18dc3bc82184f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30700_4f3000a18dc3bc82184f.png
new file mode 100644
index 0000000000000000000000000000000000000000..29b5bc0be16bc417fe12996f3081454e49aa6226
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30700_4f3000a18dc3bc82184f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30700_a529d718e32887bf9bf8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30700_a529d718e32887bf9bf8.png
new file mode 100644
index 0000000000000000000000000000000000000000..f864584153ba0b13747f89ca11cefa8aedb4f851
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30700_a529d718e32887bf9bf8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30700_bd95e06b9d192850182a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30700_bd95e06b9d192850182a.png
new file mode 100644
index 0000000000000000000000000000000000000000..7fb549038a1d9fa4400a076487f1d476a789b310
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30700_bd95e06b9d192850182a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30900_18ae48e7baae5f7dc6ba.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30900_18ae48e7baae5f7dc6ba.png
new file mode 100644
index 0000000000000000000000000000000000000000..7499574a3222b35e331f3f52d33236538d623a5f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30900_18ae48e7baae5f7dc6ba.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30900_23d21f62a44bab8e7123.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30900_23d21f62a44bab8e7123.png
new file mode 100644
index 0000000000000000000000000000000000000000..c9805a381a66c3c90f2a2c79afbb7d93f6550c3f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30900_23d21f62a44bab8e7123.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30900_b6482336ab1f4bb4b196.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30900_b6482336ab1f4bb4b196.png
new file mode 100644
index 0000000000000000000000000000000000000000..98ef5bf29aa0644887e1710af067e7a8f9f3e41a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30900_b6482336ab1f4bb4b196.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30900_c2f93dd393ff1857a6a5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30900_c2f93dd393ff1857a6a5.png
new file mode 100644
index 0000000000000000000000000000000000000000..3c2f7bf5cce23e1f3b0bbe3689eb8fb2ed184103
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_30900_c2f93dd393ff1857a6a5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3100_1953fc0b9f303e8a24fe.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3100_1953fc0b9f303e8a24fe.png
new file mode 100644
index 0000000000000000000000000000000000000000..208cee6a690ea97a788793db3bb8bb783d2783e7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3100_1953fc0b9f303e8a24fe.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3100_2c64b6e3d69e9e953fe7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3100_2c64b6e3d69e9e953fe7.png
new file mode 100644
index 0000000000000000000000000000000000000000..ec126180c8d7989ad49ee95a4cf0b5a5f5f9ffb5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3100_2c64b6e3d69e9e953fe7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3100_67c3634488a7920c7d97.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3100_67c3634488a7920c7d97.png
new file mode 100644
index 0000000000000000000000000000000000000000..b2efc38e698ad99ed54d4df90bb98d4d5903856d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3100_67c3634488a7920c7d97.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3100_f62583e71f3a2b83b124.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3100_f62583e71f3a2b83b124.png
new file mode 100644
index 0000000000000000000000000000000000000000..059f7aa459f5f332705d394ef4f4f93f85b71377
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3100_f62583e71f3a2b83b124.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31100_0b2cc8f77331dad10c29.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31100_0b2cc8f77331dad10c29.png
new file mode 100644
index 0000000000000000000000000000000000000000..4b1a03532f5837b79dc991909f1cc0fe9fdfb428
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31100_0b2cc8f77331dad10c29.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31100_3a1fd9568cd4cf9f9124.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31100_3a1fd9568cd4cf9f9124.png
new file mode 100644
index 0000000000000000000000000000000000000000..142de1553d902c7263906416ba204a3e70b89fe3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31100_3a1fd9568cd4cf9f9124.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31100_83a39f5f5eb28572c2d4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31100_83a39f5f5eb28572c2d4.png
new file mode 100644
index 0000000000000000000000000000000000000000..bbab55fd3c4793d39bd3025d9511b94a7675dcb0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31100_83a39f5f5eb28572c2d4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31100_fbb863d043facffdb520.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31100_fbb863d043facffdb520.png
new file mode 100644
index 0000000000000000000000000000000000000000..24c4dfddfca56a9ba3fc6387c295ccc5eeb10a78
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31100_fbb863d043facffdb520.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31300_2adcd45ccb8d907ab933.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31300_2adcd45ccb8d907ab933.png
new file mode 100644
index 0000000000000000000000000000000000000000..c09f49267253d2aba0eaf4c6250a886edb82988e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31300_2adcd45ccb8d907ab933.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31300_70e638f566afea9c9c73.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31300_70e638f566afea9c9c73.png
new file mode 100644
index 0000000000000000000000000000000000000000..21b0db125d3542c78a2fbf824c4c3b3b536a462d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31300_70e638f566afea9c9c73.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31300_b469437c94709b26f790.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31300_b469437c94709b26f790.png
new file mode 100644
index 0000000000000000000000000000000000000000..890f8e7449708cf551f3e18d0081893ae18d9c68
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31300_b469437c94709b26f790.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31300_cec4ea88d2fb2d3faac8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31300_cec4ea88d2fb2d3faac8.png
new file mode 100644
index 0000000000000000000000000000000000000000..5ee5af427a4f0d73825c06832a89bbbb4b9501df
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31300_cec4ea88d2fb2d3faac8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31500_429a210aa23640c4ca0f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31500_429a210aa23640c4ca0f.png
new file mode 100644
index 0000000000000000000000000000000000000000..9a2b254fad176bcab5bea906b476bde2d348d6b1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31500_429a210aa23640c4ca0f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31500_86831eb0a6650fa5826a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31500_86831eb0a6650fa5826a.png
new file mode 100644
index 0000000000000000000000000000000000000000..ead82adb0670aef98f1c6f2e03d9615feee51d7d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31500_86831eb0a6650fa5826a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31500_9b4fefda846e0182d7e2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31500_9b4fefda846e0182d7e2.png
new file mode 100644
index 0000000000000000000000000000000000000000..cdf46927c518979e591e809e95e8b290746028c3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31500_9b4fefda846e0182d7e2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31500_b0438d7b997522b7553c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31500_b0438d7b997522b7553c.png
new file mode 100644
index 0000000000000000000000000000000000000000..0e87b2e005c354c4dca674d7a40edcfa4bdf2839
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31500_b0438d7b997522b7553c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31700_c3fac261d19ba03a8e8d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31700_c3fac261d19ba03a8e8d.png
new file mode 100644
index 0000000000000000000000000000000000000000..6325f89a4d1efae47ab309b5d79502501393af23
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31700_c3fac261d19ba03a8e8d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31700_c4116526373a2bb94aea.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31700_c4116526373a2bb94aea.png
new file mode 100644
index 0000000000000000000000000000000000000000..9041797c031d0d693e0075666ef7463b7e04441f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31700_c4116526373a2bb94aea.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31700_e49a65fbb32b247d05d7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31700_e49a65fbb32b247d05d7.png
new file mode 100644
index 0000000000000000000000000000000000000000..cff84327f343a74bfed9756b85f6b53a8aa7949d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31700_e49a65fbb32b247d05d7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31700_f0f91e666d162164e07f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31700_f0f91e666d162164e07f.png
new file mode 100644
index 0000000000000000000000000000000000000000..6024b7a137ab015f35fa7803f636068ac9aff033
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31700_f0f91e666d162164e07f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31900_14dc2ff1b2bdb4f683d3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31900_14dc2ff1b2bdb4f683d3.png
new file mode 100644
index 0000000000000000000000000000000000000000..79af1e9d5aa58ace0f583da7c51f3afc89ed4214
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31900_14dc2ff1b2bdb4f683d3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31900_1fe7cea5e338dff63ee4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31900_1fe7cea5e338dff63ee4.png
new file mode 100644
index 0000000000000000000000000000000000000000..3152f4772b01a53abe8e8b03016c6239725a2d1b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31900_1fe7cea5e338dff63ee4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31900_5b4353aae8680d942807.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31900_5b4353aae8680d942807.png
new file mode 100644
index 0000000000000000000000000000000000000000..ff876c4071c26980c072668285883d3ce7c342ad
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31900_5b4353aae8680d942807.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31900_7cf10f8aaaa4f9cdeca7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31900_7cf10f8aaaa4f9cdeca7.png
new file mode 100644
index 0000000000000000000000000000000000000000..e35dcb36a43215e31e04922fd834365addf71c33
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_31900_7cf10f8aaaa4f9cdeca7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32100_0f6815a0065ce96e4dde.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32100_0f6815a0065ce96e4dde.png
new file mode 100644
index 0000000000000000000000000000000000000000..e63bfe6d2c2a9fa8b0fe6c1f5439852ce8d3c9a8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32100_0f6815a0065ce96e4dde.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32100_7fce910bbcece1f0aaca.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32100_7fce910bbcece1f0aaca.png
new file mode 100644
index 0000000000000000000000000000000000000000..c5e55ffb1a3f0501e472d073de2357b736d01c60
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32100_7fce910bbcece1f0aaca.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32100_8a18901630a08c126174.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32100_8a18901630a08c126174.png
new file mode 100644
index 0000000000000000000000000000000000000000..f8ceb7436d4e7c65a19392451900ef939ba2f8f4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32100_8a18901630a08c126174.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8a18901630a08c1261747f48193f0ad4dc18864684592bc800979a8e5c486177
+size 133587
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32100_f0a57f5937691be54892.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32100_f0a57f5937691be54892.png
new file mode 100644
index 0000000000000000000000000000000000000000..004a1c6e2d4bb0d2e2ac6921bde6efa8fcbf5921
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32100_f0a57f5937691be54892.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32300_1b37d5e95e235e197856.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32300_1b37d5e95e235e197856.png
new file mode 100644
index 0000000000000000000000000000000000000000..eb0f0f89e140c5eb1cc3d1867eead096847d8ce0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32300_1b37d5e95e235e197856.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32300_4722ca62c4c043c22cc9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32300_4722ca62c4c043c22cc9.png
new file mode 100644
index 0000000000000000000000000000000000000000..cddf53711e1643673c249958c7ddbe555377221b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32300_4722ca62c4c043c22cc9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32300_9384edca309777df6d63.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32300_9384edca309777df6d63.png
new file mode 100644
index 0000000000000000000000000000000000000000..7f37940a005690f4436ce6843f656e87ad4125ee
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32300_9384edca309777df6d63.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32300_a41690a18f8798ce6195.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32300_a41690a18f8798ce6195.png
new file mode 100644
index 0000000000000000000000000000000000000000..c20ea1c49ad0f170c8978328ca9552dbaa4d71fd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32300_a41690a18f8798ce6195.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32500_3debd537a5f1c482b91d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32500_3debd537a5f1c482b91d.png
new file mode 100644
index 0000000000000000000000000000000000000000..153a63e82ee70e64d7bc83052ae37c8aafb88175
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32500_3debd537a5f1c482b91d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32500_47145b286591e87e9bf7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32500_47145b286591e87e9bf7.png
new file mode 100644
index 0000000000000000000000000000000000000000..69c6f7f9234ac24b524b2713b6918ff21962944e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32500_47145b286591e87e9bf7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32500_d56267a80d11d45afa22.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32500_d56267a80d11d45afa22.png
new file mode 100644
index 0000000000000000000000000000000000000000..f2e3e8f83d27da2cbadb4fcc8d7794c112b9f95a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32500_d56267a80d11d45afa22.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32500_fd1b6eb1a7439320b367.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32500_fd1b6eb1a7439320b367.png
new file mode 100644
index 0000000000000000000000000000000000000000..3bcc40a1519c9d9f00f38ace9075232a2a6809d5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32500_fd1b6eb1a7439320b367.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32700_038ce5162023c1e48b07.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32700_038ce5162023c1e48b07.png
new file mode 100644
index 0000000000000000000000000000000000000000..a23fb792a1a74fdbb61d960a68c46e09c2e40a59
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32700_038ce5162023c1e48b07.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32700_12b8a5869281d19766eb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32700_12b8a5869281d19766eb.png
new file mode 100644
index 0000000000000000000000000000000000000000..2e9d162779a30d74067b18a324de38d34deb89c1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32700_12b8a5869281d19766eb.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32700_5ab64418d826bc97fd05.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32700_5ab64418d826bc97fd05.png
new file mode 100644
index 0000000000000000000000000000000000000000..4882f667d155e059a659e51adf29e82175ef2a93
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32700_5ab64418d826bc97fd05.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32700_89b2333b48079b7ea90c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32700_89b2333b48079b7ea90c.png
new file mode 100644
index 0000000000000000000000000000000000000000..df25f50bf2b0772069e8b2e859d7c32b05f0b7c3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32700_89b2333b48079b7ea90c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32900_3cbba2f6553ac420f68e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32900_3cbba2f6553ac420f68e.png
new file mode 100644
index 0000000000000000000000000000000000000000..715835f016ae268abeea9600a7e4ba458fdd48d5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32900_3cbba2f6553ac420f68e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32900_468c3c674ec63c5b409b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32900_468c3c674ec63c5b409b.png
new file mode 100644
index 0000000000000000000000000000000000000000..c62446bb3847f93c54d3cd4b05087b1b886ee3b3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32900_468c3c674ec63c5b409b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32900_da334b6b9d841bfc3e64.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32900_da334b6b9d841bfc3e64.png
new file mode 100644
index 0000000000000000000000000000000000000000..74c0ea344b82d51fec41a176a8f1058c5d4f4efa
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32900_da334b6b9d841bfc3e64.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32900_de321a2f099af2919940.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32900_de321a2f099af2919940.png
new file mode 100644
index 0000000000000000000000000000000000000000..6d7fae0caf2bb0e6137ad98452e629b7070df951
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_32900_de321a2f099af2919940.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3300_35492b6fa86970ae5c92.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3300_35492b6fa86970ae5c92.png
new file mode 100644
index 0000000000000000000000000000000000000000..93ba81fe9b9b51bb8a450a5518688edf62b1b464
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3300_35492b6fa86970ae5c92.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3300_3c4f2c58f59e3efee406.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3300_3c4f2c58f59e3efee406.png
new file mode 100644
index 0000000000000000000000000000000000000000..d42d920287071c82bda7fc0f8ef1f92e5af3df89
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3300_3c4f2c58f59e3efee406.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3300_fe7208a6fa767da931a4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3300_fe7208a6fa767da931a4.png
new file mode 100644
index 0000000000000000000000000000000000000000..997bd24d6113c28dcdbfe48aa296ce820feba804
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3300_fe7208a6fa767da931a4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3300_fedd7aef10ce6cd9ce24.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3300_fedd7aef10ce6cd9ce24.png
new file mode 100644
index 0000000000000000000000000000000000000000..823f5b9183cbaab2f4abe285b03d0029f55a36a1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3300_fedd7aef10ce6cd9ce24.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33100_15134d51b6126e262534.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33100_15134d51b6126e262534.png
new file mode 100644
index 0000000000000000000000000000000000000000..d446a2cab0de7d2afaf379acb5ac4fefb279feac
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33100_15134d51b6126e262534.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33100_1cc1c580ae4bb9f9d098.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33100_1cc1c580ae4bb9f9d098.png
new file mode 100644
index 0000000000000000000000000000000000000000..c9d952142a7c91f403165e93623b1e223b0f3f49
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33100_1cc1c580ae4bb9f9d098.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33100_3bec72f9e8da03e27532.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33100_3bec72f9e8da03e27532.png
new file mode 100644
index 0000000000000000000000000000000000000000..33e066375e54b6e24c6fa4029c365fd8c6bdf3f4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33100_3bec72f9e8da03e27532.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33100_60b1b190353efb393ad8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33100_60b1b190353efb393ad8.png
new file mode 100644
index 0000000000000000000000000000000000000000..1ed728a265fdb6be384cd4666d1faa6cee202a25
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33100_60b1b190353efb393ad8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33300_26f5e2a1a257bda996be.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33300_26f5e2a1a257bda996be.png
new file mode 100644
index 0000000000000000000000000000000000000000..d2f7a8683770cd58287bd8c7d1f18ebc93f9fcb1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33300_26f5e2a1a257bda996be.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33300_7f37009e10455c416da1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33300_7f37009e10455c416da1.png
new file mode 100644
index 0000000000000000000000000000000000000000..1045921b5361df00fd9a5b1e6384c7985f10d793
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33300_7f37009e10455c416da1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33300_cd43538da78b6a8eab8d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33300_cd43538da78b6a8eab8d.png
new file mode 100644
index 0000000000000000000000000000000000000000..01122c41bfe9109d66ebb1bf88162049f816cfbc
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33300_cd43538da78b6a8eab8d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33300_e131e108fac33f453c6f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33300_e131e108fac33f453c6f.png
new file mode 100644
index 0000000000000000000000000000000000000000..55f1447f3a12f3df4a181fd5d3023e11b74ab064
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33300_e131e108fac33f453c6f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33500_185a43d45a67f04fb600.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33500_185a43d45a67f04fb600.png
new file mode 100644
index 0000000000000000000000000000000000000000..7b44d384519680c43c87489a176f4a0eddad7ea2
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33500_185a43d45a67f04fb600.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33500_27822d7de57db4423c75.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33500_27822d7de57db4423c75.png
new file mode 100644
index 0000000000000000000000000000000000000000..b97fc730b52905791f6148e3110503a1bf374668
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33500_27822d7de57db4423c75.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33500_eb6c0987626369f8e5a9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33500_eb6c0987626369f8e5a9.png
new file mode 100644
index 0000000000000000000000000000000000000000..8eec2f05a526afc94e99b5ba42de1f234779c32c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33500_eb6c0987626369f8e5a9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33500_ed8d3c331aa61b5133f1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33500_ed8d3c331aa61b5133f1.png
new file mode 100644
index 0000000000000000000000000000000000000000..c41333d78afc740e2fc36fa33c566aca95d730dc
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33500_ed8d3c331aa61b5133f1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33700_05668f78473c552e0a7e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33700_05668f78473c552e0a7e.png
new file mode 100644
index 0000000000000000000000000000000000000000..20b65c93c1b28b140c19e2ad4153385610073a8c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33700_05668f78473c552e0a7e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33700_73a570ace242c50f07b6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33700_73a570ace242c50f07b6.png
new file mode 100644
index 0000000000000000000000000000000000000000..a6e1d2e85b10a88b0dc5cd21210da217a6b0df97
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33700_73a570ace242c50f07b6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33700_b22d5196276e8dab5e40.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33700_b22d5196276e8dab5e40.png
new file mode 100644
index 0000000000000000000000000000000000000000..2f0d2039db70e43a35048e09721514895633e9ee
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33700_b22d5196276e8dab5e40.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33700_b597ef84873132946233.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33700_b597ef84873132946233.png
new file mode 100644
index 0000000000000000000000000000000000000000..7a3290f400eb97b7bdb9dde895e49fb6bd9290aa
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33700_b597ef84873132946233.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33900_26df7b84f3ed6ba74bc7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33900_26df7b84f3ed6ba74bc7.png
new file mode 100644
index 0000000000000000000000000000000000000000..dd09e9ddcfb9f22aa0c3bf884fafa6e1a43d6c68
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33900_26df7b84f3ed6ba74bc7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33900_8db17fb9a6a20d6db697.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33900_8db17fb9a6a20d6db697.png
new file mode 100644
index 0000000000000000000000000000000000000000..5b358a53cfdf82af587e62e84784338f26e4a7c6
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33900_8db17fb9a6a20d6db697.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33900_dd4a6a953dbb6de18a2f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33900_dd4a6a953dbb6de18a2f.png
new file mode 100644
index 0000000000000000000000000000000000000000..5d7a29657f4c53432e88c588c6d4fb1583a06490
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33900_dd4a6a953dbb6de18a2f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33900_faa1b6e43f161b54312d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33900_faa1b6e43f161b54312d.png
new file mode 100644
index 0000000000000000000000000000000000000000..cb020125737bebc2fbd2d96e77c2af8416847f2e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_33900_faa1b6e43f161b54312d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34100_5e145356588cf8768984.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34100_5e145356588cf8768984.png
new file mode 100644
index 0000000000000000000000000000000000000000..dc7c7378843e4fd6e45211a1047a0fff26966dc1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34100_5e145356588cf8768984.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34100_7379eb12abfe84d2551e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34100_7379eb12abfe84d2551e.png
new file mode 100644
index 0000000000000000000000000000000000000000..b3950c8febbd72fece184f83cc2015c4d0b76bf7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34100_7379eb12abfe84d2551e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34100_c94e6a912b996a7f22e6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34100_c94e6a912b996a7f22e6.png
new file mode 100644
index 0000000000000000000000000000000000000000..71e922f889a5eda614883e55a91b4c665fc26ef3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34100_c94e6a912b996a7f22e6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34100_e79c03bca38c4ef91eee.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34100_e79c03bca38c4ef91eee.png
new file mode 100644
index 0000000000000000000000000000000000000000..29c9b0cf5c4989e7862480de61e9f8e04ae4c515
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34100_e79c03bca38c4ef91eee.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e79c03bca38c4ef91eeee985e45bcd7c66c6f00adad4b64f927b63095bce9d92
+size 109165
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34300_1f36305c05838005b6ae.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34300_1f36305c05838005b6ae.png
new file mode 100644
index 0000000000000000000000000000000000000000..baa2e9f8e268e18d4722a690961a873fa93549dc
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34300_1f36305c05838005b6ae.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34300_7099d7b8850e50bd335f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34300_7099d7b8850e50bd335f.png
new file mode 100644
index 0000000000000000000000000000000000000000..cdbb1386fa19d5675990b0479b71ca233ef473b6
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34300_7099d7b8850e50bd335f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34300_8c11eecccd4a531d1c61.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34300_8c11eecccd4a531d1c61.png
new file mode 100644
index 0000000000000000000000000000000000000000..96b21cc7cd2af97148c7606d72fb9929c2047ac1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34300_8c11eecccd4a531d1c61.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8c11eecccd4a531d1c6189d792dd07984a2106b293bf196ca6663e12924310d0
+size 137487
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34300_f4a2ebcafe2a60af7683.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34300_f4a2ebcafe2a60af7683.png
new file mode 100644
index 0000000000000000000000000000000000000000..a5b1f7f05fda3127862a73a63738b6ad324f4594
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34300_f4a2ebcafe2a60af7683.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34500_4a0848c78f7cf8908ac5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34500_4a0848c78f7cf8908ac5.png
new file mode 100644
index 0000000000000000000000000000000000000000..f0b5d765aa16c2d7d86ab928563add9b06dfb5f5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34500_4a0848c78f7cf8908ac5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4a0848c78f7cf8908ac5c5902cafa886c781ca5a7ea5a92a2df0a470409049cf
+size 120008
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34500_645ea8333e42a9ceddd5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34500_645ea8333e42a9ceddd5.png
new file mode 100644
index 0000000000000000000000000000000000000000..f54d92928d7eafe476abebb8bbdb8cc4a091a887
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34500_645ea8333e42a9ceddd5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34500_c9261246110c059005ef.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34500_c9261246110c059005ef.png
new file mode 100644
index 0000000000000000000000000000000000000000..530cce5b94cf63eafa10e263391f48262da08e1e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34500_c9261246110c059005ef.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34500_eab798ffd2bb0e6ad698.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34500_eab798ffd2bb0e6ad698.png
new file mode 100644
index 0000000000000000000000000000000000000000..68657bb5e83303a9f9db750a8d9428ac418c23da
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34500_eab798ffd2bb0e6ad698.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34700_31211772486ca520158d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34700_31211772486ca520158d.png
new file mode 100644
index 0000000000000000000000000000000000000000..8ec2c2c665c961bf9f832428bf4a13146c13aa3e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34700_31211772486ca520158d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34700_80343e5c211947c92074.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34700_80343e5c211947c92074.png
new file mode 100644
index 0000000000000000000000000000000000000000..72fbb1223159bc00599ef78d129daec66b51ba1f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34700_80343e5c211947c92074.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:80343e5c211947c920749a5fc76a39a37499b1868fdff0f835ba521e9a258caa
+size 127797
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34700_82516c8d242b401e082a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34700_82516c8d242b401e082a.png
new file mode 100644
index 0000000000000000000000000000000000000000..fa7bec86d47143c3d73edb7cce1a44429e95ac2e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34700_82516c8d242b401e082a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34700_b2a977a9eb9abfa91a5e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34700_b2a977a9eb9abfa91a5e.png
new file mode 100644
index 0000000000000000000000000000000000000000..df2e8235219d3bf0f87cadebc280fe2f0e482a64
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34700_b2a977a9eb9abfa91a5e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34900_4f8ed4d8b540a3961535.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34900_4f8ed4d8b540a3961535.png
new file mode 100644
index 0000000000000000000000000000000000000000..5a065ef797804215e7fc2ca105af7627fb195536
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34900_4f8ed4d8b540a3961535.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34900_66ad3db260181255aaa2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34900_66ad3db260181255aaa2.png
new file mode 100644
index 0000000000000000000000000000000000000000..5493bf41e952ed38dd1c21778de328ebbd81207c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34900_66ad3db260181255aaa2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34900_6776734b5c77485c8ea5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34900_6776734b5c77485c8ea5.png
new file mode 100644
index 0000000000000000000000000000000000000000..c33552f845934ee7f02e615b581a464043a14441
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34900_6776734b5c77485c8ea5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6776734b5c77485c8ea51ca52fcb3193ec154548184d9817a44d146cd870ace3
+size 112010
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34900_76dfb286cc33f6c5ec56.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34900_76dfb286cc33f6c5ec56.png
new file mode 100644
index 0000000000000000000000000000000000000000..0791d53c9f3e93d2a09f545705ca322fef874b32
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_34900_76dfb286cc33f6c5ec56.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3500_23938ddf4ea68feb89ac.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3500_23938ddf4ea68feb89ac.png
new file mode 100644
index 0000000000000000000000000000000000000000..d7f701ee7ab6498ffcd55f31e862cc3b86ba4cc4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3500_23938ddf4ea68feb89ac.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3500_903263541cd7828fdc83.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3500_903263541cd7828fdc83.png
new file mode 100644
index 0000000000000000000000000000000000000000..441d5dce8a7fe8e27267864cca0a9bc845f2606f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3500_903263541cd7828fdc83.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3500_909be3e175762d749f36.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3500_909be3e175762d749f36.png
new file mode 100644
index 0000000000000000000000000000000000000000..d4b91c0c89e4f395f15ff5ed4ecff3588a695cac
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3500_909be3e175762d749f36.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3500_e278f5b335a0adcd41ff.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3500_e278f5b335a0adcd41ff.png
new file mode 100644
index 0000000000000000000000000000000000000000..3263cc464e119b086c2683bb40caae0c757ccc1f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3500_e278f5b335a0adcd41ff.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35100_22e530a267bfad949217.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35100_22e530a267bfad949217.png
new file mode 100644
index 0000000000000000000000000000000000000000..6f12d83500000e5441f39af800bbb1e7e0eb3e93
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35100_22e530a267bfad949217.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35100_3010b81c0f8522369105.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35100_3010b81c0f8522369105.png
new file mode 100644
index 0000000000000000000000000000000000000000..594d736554903d918cbfad12c83a6af433dabfd5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35100_3010b81c0f8522369105.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35100_79bb057c50fa0c922b4e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35100_79bb057c50fa0c922b4e.png
new file mode 100644
index 0000000000000000000000000000000000000000..7e3358766a946474e43866b1285b92eb193bf318
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35100_79bb057c50fa0c922b4e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:79bb057c50fa0c922b4ed1430e7d8c845def49cc18b842b4a9f8e2a7715541fc
+size 106968
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35100_b4305e67c6ed13353ff7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35100_b4305e67c6ed13353ff7.png
new file mode 100644
index 0000000000000000000000000000000000000000..5f13ef99a148241c2eb7df1ed4cfc0af72ecedf5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35100_b4305e67c6ed13353ff7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35300_734cb98b3de7bf66587a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35300_734cb98b3de7bf66587a.png
new file mode 100644
index 0000000000000000000000000000000000000000..9edaf5b076b2583c7690591e60fb5151c26f05df
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35300_734cb98b3de7bf66587a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:734cb98b3de7bf66587a4fefe2030c5d01300639b6b01e59dabbb2fe73bd2d6f
+size 139712
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35300_8b4c47f46a80ce3616be.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35300_8b4c47f46a80ce3616be.png
new file mode 100644
index 0000000000000000000000000000000000000000..3157813c2980b36cb854cbae4b88395acddecb21
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35300_8b4c47f46a80ce3616be.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35300_9bd067916caa5ee03be6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35300_9bd067916caa5ee03be6.png
new file mode 100644
index 0000000000000000000000000000000000000000..33653bee2482679fa6333a40232003973b874aa4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35300_9bd067916caa5ee03be6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35300_b47a900d34ef2ac72a4b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35300_b47a900d34ef2ac72a4b.png
new file mode 100644
index 0000000000000000000000000000000000000000..a804f4950ebd5bf131f0f2fd5f294cd291f5880c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35300_b47a900d34ef2ac72a4b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35500_3248bac070573aaa7c41.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35500_3248bac070573aaa7c41.png
new file mode 100644
index 0000000000000000000000000000000000000000..47a2c675061d31c4e966b5afb816f4c01a5eee4f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35500_3248bac070573aaa7c41.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35500_4a8f494b66f2f87b02fb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35500_4a8f494b66f2f87b02fb.png
new file mode 100644
index 0000000000000000000000000000000000000000..c6981b94961d596c737d9cd4803d4a261f3391c9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35500_4a8f494b66f2f87b02fb.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35500_6acd2044ad4f38b131ba.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35500_6acd2044ad4f38b131ba.png
new file mode 100644
index 0000000000000000000000000000000000000000..8c2ad0cfe8c0f469400fcf4c0c9997190bd527c1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35500_6acd2044ad4f38b131ba.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35500_e8898f3a337b63a4b1cf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35500_e8898f3a337b63a4b1cf.png
new file mode 100644
index 0000000000000000000000000000000000000000..92785d79c09fd2c7f88c597f4555b82a82538ec8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35500_e8898f3a337b63a4b1cf.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35700_345621e33f6b5cfd0260.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35700_345621e33f6b5cfd0260.png
new file mode 100644
index 0000000000000000000000000000000000000000..aa32716926ea56cb0e42ec3606adb3ce32b3a194
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35700_345621e33f6b5cfd0260.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:345621e33f6b5cfd0260888224832f47524221eb96bff640437e5b32bfc7980f
+size 127337
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35700_5bbb30190ffcc21dd8d9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35700_5bbb30190ffcc21dd8d9.png
new file mode 100644
index 0000000000000000000000000000000000000000..0f88db80b72ba553a55acc7d40e0f7e22a9b01fe
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35700_5bbb30190ffcc21dd8d9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35700_eca0f28425457bfce6a5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35700_eca0f28425457bfce6a5.png
new file mode 100644
index 0000000000000000000000000000000000000000..df1bfe4720836d164c0141ab21033c34bba458ae
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35700_eca0f28425457bfce6a5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35700_fa3c1d45dca7451b59d4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35700_fa3c1d45dca7451b59d4.png
new file mode 100644
index 0000000000000000000000000000000000000000..3b832374353671eee1b1de4194ed13f2ec5970d4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35700_fa3c1d45dca7451b59d4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35900_33b21de11fecca0464cb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35900_33b21de11fecca0464cb.png
new file mode 100644
index 0000000000000000000000000000000000000000..55a0336b8f5b531e879de8cee8b38e547ac1e973
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35900_33b21de11fecca0464cb.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35900_4d85a07c4da8965b30e3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35900_4d85a07c4da8965b30e3.png
new file mode 100644
index 0000000000000000000000000000000000000000..d6d3dd896400cf4fe39e6990f8ea2543ec605dc5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35900_4d85a07c4da8965b30e3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35900_b3954305447f5d63f085.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35900_b3954305447f5d63f085.png
new file mode 100644
index 0000000000000000000000000000000000000000..eb9639029167add02ff916a8adb8599abc9dab4f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35900_b3954305447f5d63f085.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35900_cbe611b89f3cca806972.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35900_cbe611b89f3cca806972.png
new file mode 100644
index 0000000000000000000000000000000000000000..fae722140abe90041c18ac3e7c34432a1a952dd0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_35900_cbe611b89f3cca806972.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36100_40e07a833c009819c9b9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36100_40e07a833c009819c9b9.png
new file mode 100644
index 0000000000000000000000000000000000000000..6a86acfd0043427b400955abdd9608c71aa86a9a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36100_40e07a833c009819c9b9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36100_ab82a83cc25da24532f8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36100_ab82a83cc25da24532f8.png
new file mode 100644
index 0000000000000000000000000000000000000000..7ec979229f8eda0145e8ec0eff5f2e821b668f12
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36100_ab82a83cc25da24532f8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ab82a83cc25da24532f8e96690c36bc1339014f203a0e70e4b1bd45105dae7f4
+size 111588
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36100_b7fe2e7424ae721c5fb5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36100_b7fe2e7424ae721c5fb5.png
new file mode 100644
index 0000000000000000000000000000000000000000..2282c23b5653e4759b67104d59ec82eb55db6d2d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36100_b7fe2e7424ae721c5fb5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36100_bdbd5f5a9964f20e3c17.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36100_bdbd5f5a9964f20e3c17.png
new file mode 100644
index 0000000000000000000000000000000000000000..a03672a812a6e82d127b2a1ada0c1b44fb6707fd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36100_bdbd5f5a9964f20e3c17.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36300_2d68af26df618ecda8ee.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36300_2d68af26df618ecda8ee.png
new file mode 100644
index 0000000000000000000000000000000000000000..347bb1a1374b6cd12f8031485311ade9fd5b8021
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36300_2d68af26df618ecda8ee.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d68af26df618ecda8ee603af583f6c65f7189aef360065add034172cf4c4ba3
+size 108794
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36300_4fa1a24663eeb30eb703.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36300_4fa1a24663eeb30eb703.png
new file mode 100644
index 0000000000000000000000000000000000000000..3f485c2cbdd62c46b5e8ebd8d37ffcb5f8faee2c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36300_4fa1a24663eeb30eb703.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36300_54351548956235acbcf6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36300_54351548956235acbcf6.png
new file mode 100644
index 0000000000000000000000000000000000000000..accea300f0fc1bed217dd4124f72d6cc413cde5b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36300_54351548956235acbcf6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36300_bfc694c47e1b40aeac1d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36300_bfc694c47e1b40aeac1d.png
new file mode 100644
index 0000000000000000000000000000000000000000..03a439e9f974975d921ec1d7bc1b93e49a1dccfd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36300_bfc694c47e1b40aeac1d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bfc694c47e1b40aeac1dcb0d320e962d92ad8dd6ca45dd0cf364cb2743c3e59e
+size 107270
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36500_62979b177d9eac4869ab.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36500_62979b177d9eac4869ab.png
new file mode 100644
index 0000000000000000000000000000000000000000..06e8116fb8faa07233e98a400bb634d40c45fc7e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36500_62979b177d9eac4869ab.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36500_b907dfba514c9ac40e81.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36500_b907dfba514c9ac40e81.png
new file mode 100644
index 0000000000000000000000000000000000000000..238fa9303f2437a30ce4d147ac2584c292690301
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36500_b907dfba514c9ac40e81.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36500_df0e542b5ab7b57ee50f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36500_df0e542b5ab7b57ee50f.png
new file mode 100644
index 0000000000000000000000000000000000000000..76dc201ab1066c857931411c22524df76be9dd0a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36500_df0e542b5ab7b57ee50f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36500_e06b1dba21f0e09d6aec.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36500_e06b1dba21f0e09d6aec.png
new file mode 100644
index 0000000000000000000000000000000000000000..c091ed06e4980c59d06faaf65b2dcf691f649ec9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36500_e06b1dba21f0e09d6aec.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36700_080daf370f30a82657ab.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36700_080daf370f30a82657ab.png
new file mode 100644
index 0000000000000000000000000000000000000000..5899dd642438176d63b9904fba5aa602a1c33c5f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36700_080daf370f30a82657ab.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36700_4c07efd30b72af048ed9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36700_4c07efd30b72af048ed9.png
new file mode 100644
index 0000000000000000000000000000000000000000..d643b0106528f78033f634c8a3b81484f2c2fd10
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36700_4c07efd30b72af048ed9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36700_aa8e4c38d6c5ec76027d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36700_aa8e4c38d6c5ec76027d.png
new file mode 100644
index 0000000000000000000000000000000000000000..d5a5ef52fb4f2256926a44724df0c242ffb53bcf
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36700_aa8e4c38d6c5ec76027d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36700_d484e49105a5859b3b64.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36700_d484e49105a5859b3b64.png
new file mode 100644
index 0000000000000000000000000000000000000000..6575fee04307b0a3a0e6a626573df6b87c710b38
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36700_d484e49105a5859b3b64.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36900_80a656fb8f9abed9470e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36900_80a656fb8f9abed9470e.png
new file mode 100644
index 0000000000000000000000000000000000000000..0f60a8ae46a1d0042243c16f449e174a37dd4d2c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36900_80a656fb8f9abed9470e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36900_950af257c72ceb59aff9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36900_950af257c72ceb59aff9.png
new file mode 100644
index 0000000000000000000000000000000000000000..1a0cd508a784860919c9dae24cf2f85f89857ee2
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36900_950af257c72ceb59aff9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36900_96a7a493ae60d5f63dff.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36900_96a7a493ae60d5f63dff.png
new file mode 100644
index 0000000000000000000000000000000000000000..82db044abacadebfd1e1fd8b4ea6e3b285042d89
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36900_96a7a493ae60d5f63dff.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:96a7a493ae60d5f63dff146112fb26d51c35dc43118c6cedf70fb54fc210ec24
+size 108752
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36900_b45d98e2b08d5a2708ab.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36900_b45d98e2b08d5a2708ab.png
new file mode 100644
index 0000000000000000000000000000000000000000..ea77a26dbc8def207ba67fdfd2ce5e3f4c4dcb18
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_36900_b45d98e2b08d5a2708ab.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3700_5583d1307f65ce1a730f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3700_5583d1307f65ce1a730f.png
new file mode 100644
index 0000000000000000000000000000000000000000..a05a92b90727b58dc1a8628693f14a78b21b2e94
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3700_5583d1307f65ce1a730f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3700_6d213d786334ee7d9de5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3700_6d213d786334ee7d9de5.png
new file mode 100644
index 0000000000000000000000000000000000000000..f8160ad8e0285174c229c04efa302e402ec26671
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3700_6d213d786334ee7d9de5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3700_9842a977211ca3a074b9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3700_9842a977211ca3a074b9.png
new file mode 100644
index 0000000000000000000000000000000000000000..50eed8090f5721023f0a14470f01dede8762d12f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3700_9842a977211ca3a074b9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3700_bfbf39b46b324543a553.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3700_bfbf39b46b324543a553.png
new file mode 100644
index 0000000000000000000000000000000000000000..86a884528d3538755eea7c7c2fe791a7dff25c2f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3700_bfbf39b46b324543a553.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37100_53433807c45629157f0e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37100_53433807c45629157f0e.png
new file mode 100644
index 0000000000000000000000000000000000000000..971ba386e6da3d164409b5830d5dc4d11f6b7239
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37100_53433807c45629157f0e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37100_64ed89ba4435ed400703.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37100_64ed89ba4435ed400703.png
new file mode 100644
index 0000000000000000000000000000000000000000..7a9e09d3e267a3205600c1fea776b33280bdf2d6
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37100_64ed89ba4435ed400703.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37100_964fa403ce1f50a3c83b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37100_964fa403ce1f50a3c83b.png
new file mode 100644
index 0000000000000000000000000000000000000000..ea5e6ec1fdd9e4aaaaee59fa19fa6df0aa8a8edf
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37100_964fa403ce1f50a3c83b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37100_f2db108c6a62397c5170.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37100_f2db108c6a62397c5170.png
new file mode 100644
index 0000000000000000000000000000000000000000..887be2e8f7a4d6646d5b7f3f9ab6dcf1a85f5250
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37100_f2db108c6a62397c5170.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37300_05024c8b6d9a0c935ed3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37300_05024c8b6d9a0c935ed3.png
new file mode 100644
index 0000000000000000000000000000000000000000..cc95bb370fdfe5603e009462ce087f44b203d1b3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37300_05024c8b6d9a0c935ed3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37300_684b5380900faf5ec434.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37300_684b5380900faf5ec434.png
new file mode 100644
index 0000000000000000000000000000000000000000..8b3ee97d342fc737bb8188267e2a53ced9b2220e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37300_684b5380900faf5ec434.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37300_697ec168d1bec6112a6a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37300_697ec168d1bec6112a6a.png
new file mode 100644
index 0000000000000000000000000000000000000000..90dc3fa3fc71752cbbb669449b31f25d9d053528
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37300_697ec168d1bec6112a6a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37300_ef46905bd01745822d9a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37300_ef46905bd01745822d9a.png
new file mode 100644
index 0000000000000000000000000000000000000000..8d3d26e7f1a332a6f1ea256b89473ffa97c38667
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37300_ef46905bd01745822d9a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37500_340bfb4306119ed0d611.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37500_340bfb4306119ed0d611.png
new file mode 100644
index 0000000000000000000000000000000000000000..84aeee36fa6c1b18fe270512d5a967c2ce990d95
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37500_340bfb4306119ed0d611.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37500_b935a9a06c24e273fa94.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37500_b935a9a06c24e273fa94.png
new file mode 100644
index 0000000000000000000000000000000000000000..c402405d1aed93475fa4933758233079a7fe3deb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37500_b935a9a06c24e273fa94.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37500_c66b3cd9616340bc1aa4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37500_c66b3cd9616340bc1aa4.png
new file mode 100644
index 0000000000000000000000000000000000000000..364beeac82c341eb59866681268680069a8e5674
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37500_c66b3cd9616340bc1aa4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37500_fe9b68c7ad50cbdf7d09.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37500_fe9b68c7ad50cbdf7d09.png
new file mode 100644
index 0000000000000000000000000000000000000000..945f02d85648d0abca8d3389dfd617eaf047391b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37500_fe9b68c7ad50cbdf7d09.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fe9b68c7ad50cbdf7d097425d7d143e98b860db5a98ce8412c12153ff826ef27
+size 147916
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37700_859722a8eda79fff0917.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37700_859722a8eda79fff0917.png
new file mode 100644
index 0000000000000000000000000000000000000000..6e51774a61933bbe4bee10a4f464014046bfd342
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37700_859722a8eda79fff0917.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37700_98314b3e919bba7ed12f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37700_98314b3e919bba7ed12f.png
new file mode 100644
index 0000000000000000000000000000000000000000..e3b493c52280ac4d3da6ad335ba5959a38ceb6d8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37700_98314b3e919bba7ed12f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37700_a3d84d403e257f6ddcff.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37700_a3d84d403e257f6ddcff.png
new file mode 100644
index 0000000000000000000000000000000000000000..8ca5e4c9207dd99d5ecefb02c3cf5bd6f94c5d20
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37700_a3d84d403e257f6ddcff.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a3d84d403e257f6ddcff81d2f88e483e7b2b236763f57d2f51bc5093a095f50f
+size 119640
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37700_c8f4f9cbd1e1e781c03e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37700_c8f4f9cbd1e1e781c03e.png
new file mode 100644
index 0000000000000000000000000000000000000000..f90024eea79430fc0bc4786b21b301a7b1006c8d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37700_c8f4f9cbd1e1e781c03e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37900_4e8dc75a724137e58088.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37900_4e8dc75a724137e58088.png
new file mode 100644
index 0000000000000000000000000000000000000000..2f1363724d7478a9febf851206a14e1eaa66526b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37900_4e8dc75a724137e58088.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37900_8168bd8e8e1b8c8235f6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37900_8168bd8e8e1b8c8235f6.png
new file mode 100644
index 0000000000000000000000000000000000000000..098e264ccd0356b2ae0af9675acaafe366fa3f6a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37900_8168bd8e8e1b8c8235f6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37900_823ceb75306ca80f13a1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37900_823ceb75306ca80f13a1.png
new file mode 100644
index 0000000000000000000000000000000000000000..ad07d6dd990aaf30e3987ea672b7c22163942bc9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37900_823ceb75306ca80f13a1.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37900_dc9ea3cd25b6617aa538.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37900_dc9ea3cd25b6617aa538.png
new file mode 100644
index 0000000000000000000000000000000000000000..0f4c9478d9a0f79d06a37d225fd6fd8eb3bc966e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_37900_dc9ea3cd25b6617aa538.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3900_25360013b79733d1e038.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3900_25360013b79733d1e038.png
new file mode 100644
index 0000000000000000000000000000000000000000..c2804d943aa66e0d4b095c5348adc6b4395fba7c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3900_25360013b79733d1e038.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3900_5267fa2c5bd09ec9a4ec.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3900_5267fa2c5bd09ec9a4ec.png
new file mode 100644
index 0000000000000000000000000000000000000000..b980c94045b12f5fef81d1440cbe944669c05a54
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3900_5267fa2c5bd09ec9a4ec.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3900_84489ea76079ea0d6aa7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3900_84489ea76079ea0d6aa7.png
new file mode 100644
index 0000000000000000000000000000000000000000..83bfc4e9710f9bcf0b96c0f9a0be2ba4a773243c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3900_84489ea76079ea0d6aa7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3900_cf786d48500c0f71c4a3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3900_cf786d48500c0f71c4a3.png
new file mode 100644
index 0000000000000000000000000000000000000000..696ab3b20de009205ef8b0d394c44e0e45f16131
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_3900_cf786d48500c0f71c4a3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4100_28e4f5e882ea6d9ab4d1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4100_28e4f5e882ea6d9ab4d1.png
new file mode 100644
index 0000000000000000000000000000000000000000..5a2a5a8314a85b40bca249fcd1493489f0773e4d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4100_28e4f5e882ea6d9ab4d1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:28e4f5e882ea6d9ab4d137cfbb906587f1475deb79c61289e644769c6fb10dd2
+size 108578
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4100_496177a72c3abf79b886.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4100_496177a72c3abf79b886.png
new file mode 100644
index 0000000000000000000000000000000000000000..c3ec3c83447a5282b5cfe695dc6cd7942eafea9e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4100_496177a72c3abf79b886.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4100_7f73bdca9bd8251f93c2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4100_7f73bdca9bd8251f93c2.png
new file mode 100644
index 0000000000000000000000000000000000000000..06d041cd10c09acbcbf917a15518bb73d06289d5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4100_7f73bdca9bd8251f93c2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4100_b913e9efae27af53c529.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4100_b913e9efae27af53c529.png
new file mode 100644
index 0000000000000000000000000000000000000000..4bfc9e44457365223cb9099c2743ba27f83c00ec
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4100_b913e9efae27af53c529.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4300_3de6720c8411a8477891.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4300_3de6720c8411a8477891.png
new file mode 100644
index 0000000000000000000000000000000000000000..829769dd50739350061c664eb6b9945703f2eac5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4300_3de6720c8411a8477891.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4300_588bc61677f3bab054c8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4300_588bc61677f3bab054c8.png
new file mode 100644
index 0000000000000000000000000000000000000000..5f6d9d1ae1a3dcc33d20aee0cee08de17d94539f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4300_588bc61677f3bab054c8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4300_7d7a75b80eff6dd9c044.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4300_7d7a75b80eff6dd9c044.png
new file mode 100644
index 0000000000000000000000000000000000000000..065e7ac624826ee5de70fc03c9e5e96f52a20c37
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4300_7d7a75b80eff6dd9c044.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4300_fef1257f77662b6c4725.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4300_fef1257f77662b6c4725.png
new file mode 100644
index 0000000000000000000000000000000000000000..97e3128daaa20bcd812888c7aaf79b0669c4f76d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4300_fef1257f77662b6c4725.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4500_5131aa659129ef359430.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4500_5131aa659129ef359430.png
new file mode 100644
index 0000000000000000000000000000000000000000..880cdbaefde81ec19ab44e4a6ea22db36b52943e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4500_5131aa659129ef359430.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4500_cb1f9f7430f1f4626051.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4500_cb1f9f7430f1f4626051.png
new file mode 100644
index 0000000000000000000000000000000000000000..042d0f38952139a6396f4a56e42b48d49d6d4e1c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4500_cb1f9f7430f1f4626051.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4500_ea09160793af079617c0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4500_ea09160793af079617c0.png
new file mode 100644
index 0000000000000000000000000000000000000000..8fd4e6f723fada4cb5bb9fd30bb95fc0a2e7e8f7
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4500_ea09160793af079617c0.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4500_f590bbc2d2e2de67e321.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4500_f590bbc2d2e2de67e321.png
new file mode 100644
index 0000000000000000000000000000000000000000..7e9efd77c0f4ef08829b29273feb9d283289c137
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4500_f590bbc2d2e2de67e321.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4700_3dd3f3c86c057912475e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4700_3dd3f3c86c057912475e.png
new file mode 100644
index 0000000000000000000000000000000000000000..91d564adefb3b835df76edfb5925909b883831cb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4700_3dd3f3c86c057912475e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4700_4558a419982639929bbf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4700_4558a419982639929bbf.png
new file mode 100644
index 0000000000000000000000000000000000000000..cffa9aee0761f424c3148f957f2059cccb418142
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4700_4558a419982639929bbf.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4700_4a6b826ac1318a1c13c5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4700_4a6b826ac1318a1c13c5.png
new file mode 100644
index 0000000000000000000000000000000000000000..0ea46fc0f03b19abae897e9a26793969570a069e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4700_4a6b826ac1318a1c13c5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4700_c35de50b9d31122569c8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4700_c35de50b9d31122569c8.png
new file mode 100644
index 0000000000000000000000000000000000000000..7ae07179d4fab0811d212cb4c678f98b0523c110
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4700_c35de50b9d31122569c8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4900_65dbe42ee6d4fe96a972.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4900_65dbe42ee6d4fe96a972.png
new file mode 100644
index 0000000000000000000000000000000000000000..ad7d0d10226318b7d250ced9b475212b300e8ca9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4900_65dbe42ee6d4fe96a972.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4900_972903c5fe10e1780356.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4900_972903c5fe10e1780356.png
new file mode 100644
index 0000000000000000000000000000000000000000..0fff1c10e7f3dca67c3c3163c635c481d9219500
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4900_972903c5fe10e1780356.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4900_f1251c24fce35b521464.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4900_f1251c24fce35b521464.png
new file mode 100644
index 0000000000000000000000000000000000000000..6edc13d6453e48cc59d521e9dfd46aa152002b43
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4900_f1251c24fce35b521464.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4900_fdd0c5b1fc768ba98d68.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4900_fdd0c5b1fc768ba98d68.png
new file mode 100644
index 0000000000000000000000000000000000000000..90d227f807415686730328ee1de999ebf5ce32b0
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_4900_fdd0c5b1fc768ba98d68.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_500_7a7ece967152021ab306.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_500_7a7ece967152021ab306.png
new file mode 100644
index 0000000000000000000000000000000000000000..3999b2d42c9f939cba48a2a5497d50bb05fa86cc
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_500_7a7ece967152021ab306.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_500_95c652e60d3cd4c6cc22.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_500_95c652e60d3cd4c6cc22.png
new file mode 100644
index 0000000000000000000000000000000000000000..8e72931b761ea699bf8e641172d355b6c4986971
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_500_95c652e60d3cd4c6cc22.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_500_a7c8484d0891da467ee9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_500_a7c8484d0891da467ee9.png
new file mode 100644
index 0000000000000000000000000000000000000000..a70026076e293daef41e1218fb6545b9837836fb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_500_a7c8484d0891da467ee9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_500_aa5dd8138f82ba3818b7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_500_aa5dd8138f82ba3818b7.png
new file mode 100644
index 0000000000000000000000000000000000000000..374734d6ea5029606979b15f11b6e04433e3e15c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_500_aa5dd8138f82ba3818b7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5100_2db831e241e1ef0a279c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5100_2db831e241e1ef0a279c.png
new file mode 100644
index 0000000000000000000000000000000000000000..922bbd86d5233329da0776bce1653b399677fcab
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5100_2db831e241e1ef0a279c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5100_67692e49619b5279e040.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5100_67692e49619b5279e040.png
new file mode 100644
index 0000000000000000000000000000000000000000..b1a668658ed0b4ed500c08ea002c8d90b363148c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5100_67692e49619b5279e040.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:67692e49619b5279e040c2fcb7835c1ca696947ab7834533cf73d578d8ce5b75
+size 116515
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5100_9d6d5090cb4b0bec7f4b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5100_9d6d5090cb4b0bec7f4b.png
new file mode 100644
index 0000000000000000000000000000000000000000..b591a768144253d429315d389dd2e6999d3469de
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5100_9d6d5090cb4b0bec7f4b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5100_bca25025a17d1889b2f9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5100_bca25025a17d1889b2f9.png
new file mode 100644
index 0000000000000000000000000000000000000000..3532c68bd3073c82bd56786a84006bb9a8b95dfc
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5100_bca25025a17d1889b2f9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5300_b65a3d2adff3e1c091e3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5300_b65a3d2adff3e1c091e3.png
new file mode 100644
index 0000000000000000000000000000000000000000..5cd1c0d619712edf8959b01f562f324c43480f5a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5300_b65a3d2adff3e1c091e3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5300_caf93157083ddb5a1d71.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5300_caf93157083ddb5a1d71.png
new file mode 100644
index 0000000000000000000000000000000000000000..9d4192ac45912085eae8cc12620297d46b6dc103
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5300_caf93157083ddb5a1d71.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5300_e07f0ad89b4a68264d8c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5300_e07f0ad89b4a68264d8c.png
new file mode 100644
index 0000000000000000000000000000000000000000..91253e99ac2f71d33f9b312f080e1877d407816b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5300_e07f0ad89b4a68264d8c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5300_f64db927d1ffbf023230.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5300_f64db927d1ffbf023230.png
new file mode 100644
index 0000000000000000000000000000000000000000..a6f681d1164dcb3d86d0a2dbc9694602f6da31df
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5300_f64db927d1ffbf023230.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5500_0514c35ecfbbebba4fa9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5500_0514c35ecfbbebba4fa9.png
new file mode 100644
index 0000000000000000000000000000000000000000..6973ec5caced8740705d302b6e83113d9f446b2f
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5500_0514c35ecfbbebba4fa9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5500_07bd71675e70ccaa6b6b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5500_07bd71675e70ccaa6b6b.png
new file mode 100644
index 0000000000000000000000000000000000000000..d84e1a938fa8f0b338c2944f3b944f8d00f02a1b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5500_07bd71675e70ccaa6b6b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5500_39b976cccc17d20a752c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5500_39b976cccc17d20a752c.png
new file mode 100644
index 0000000000000000000000000000000000000000..32b4b6aea26581cff361c2bab9dbaef34bf6ce9e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5500_39b976cccc17d20a752c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5500_63ef17c3d730449b5cee.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5500_63ef17c3d730449b5cee.png
new file mode 100644
index 0000000000000000000000000000000000000000..19a258d362d16210344f0fcca3a6be41aece4529
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5500_63ef17c3d730449b5cee.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5700_11f5ac55ef9bea4c44b5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5700_11f5ac55ef9bea4c44b5.png
new file mode 100644
index 0000000000000000000000000000000000000000..4f2cbcf05f656d5fe84fcd9c3042f6aa221f7794
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5700_11f5ac55ef9bea4c44b5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5700_14905a1f2a8e6c8cab39.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5700_14905a1f2a8e6c8cab39.png
new file mode 100644
index 0000000000000000000000000000000000000000..e5751b0770dfd074c6932429a17cd225d2d414f9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5700_14905a1f2a8e6c8cab39.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5700_94636eefe326e9583c91.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5700_94636eefe326e9583c91.png
new file mode 100644
index 0000000000000000000000000000000000000000..9dd9b217d2a55f021cc1be05e99d14442a9eddfd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5700_94636eefe326e9583c91.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5700_f920b8b441d7c0941b21.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5700_f920b8b441d7c0941b21.png
new file mode 100644
index 0000000000000000000000000000000000000000..bc45b70329cffdd8410bd507c8606d4340934d02
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5700_f920b8b441d7c0941b21.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5900_ad7196a38f3f2173e31a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5900_ad7196a38f3f2173e31a.png
new file mode 100644
index 0000000000000000000000000000000000000000..956bdb41fc79620e4bc128ed5f03cbce55f33366
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5900_ad7196a38f3f2173e31a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5900_bfbba74f67375c28a7ee.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5900_bfbba74f67375c28a7ee.png
new file mode 100644
index 0000000000000000000000000000000000000000..d5ddbeeeed0ee68857afc0e6a2c317a7c5b9224d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5900_bfbba74f67375c28a7ee.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5900_e8591faf9026826625a7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5900_e8591faf9026826625a7.png
new file mode 100644
index 0000000000000000000000000000000000000000..d64fb70822318ea67bdfd037a8838d964b53915c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5900_e8591faf9026826625a7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5900_f01c1bdb2ba62ed0d9b5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5900_f01c1bdb2ba62ed0d9b5.png
new file mode 100644
index 0000000000000000000000000000000000000000..a0a9dbc7d16c84e3e995db2669725a9406e6ae6b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_5900_f01c1bdb2ba62ed0d9b5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6100_08eef94b3e2ea791f1f7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6100_08eef94b3e2ea791f1f7.png
new file mode 100644
index 0000000000000000000000000000000000000000..71bc669bf94cb28882e6d46c23631948b39d28bd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6100_08eef94b3e2ea791f1f7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6100_0a79ce5df977d6485d88.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6100_0a79ce5df977d6485d88.png
new file mode 100644
index 0000000000000000000000000000000000000000..8b49135d7ce5497b8dcb3d815f53e24dbd79b1eb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6100_0a79ce5df977d6485d88.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6100_95a675226df51d1aa3b6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6100_95a675226df51d1aa3b6.png
new file mode 100644
index 0000000000000000000000000000000000000000..67b0f67fef276622f9d95f69b18121d6b80b8a1e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6100_95a675226df51d1aa3b6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6100_b8d2ae70ae98e363a614.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6100_b8d2ae70ae98e363a614.png
new file mode 100644
index 0000000000000000000000000000000000000000..6c5a63ef7715e882026decdd4bf88d4ecac00563
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6100_b8d2ae70ae98e363a614.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6300_3bcbff80252e702f6869.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6300_3bcbff80252e702f6869.png
new file mode 100644
index 0000000000000000000000000000000000000000..0e3d1a674385114b0f57b165d20fc86b82e81a92
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6300_3bcbff80252e702f6869.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6300_8abfa4e92391880dd650.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6300_8abfa4e92391880dd650.png
new file mode 100644
index 0000000000000000000000000000000000000000..74ee5742256d38409e15e55e9b21495647bc6fbb
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6300_8abfa4e92391880dd650.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6300_c53db8093d0665065d6f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6300_c53db8093d0665065d6f.png
new file mode 100644
index 0000000000000000000000000000000000000000..6161fb0723ff848ec513ceed5501154f3311b35b
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6300_c53db8093d0665065d6f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6300_e8ef6ca4de499997d60e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6300_e8ef6ca4de499997d60e.png
new file mode 100644
index 0000000000000000000000000000000000000000..5c3e0ad4c026ed919c1a8e41344c525f7e161647
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6300_e8ef6ca4de499997d60e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6500_135e9dd23a69a2ed54b5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6500_135e9dd23a69a2ed54b5.png
new file mode 100644
index 0000000000000000000000000000000000000000..89c5c43cb2741dd05b1347db3af2dd9904b0b9f1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6500_135e9dd23a69a2ed54b5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6500_3204a5145d10bd785fb6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6500_3204a5145d10bd785fb6.png
new file mode 100644
index 0000000000000000000000000000000000000000..657ec26afd196e71c05a8cda70ce051ef862a94c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6500_3204a5145d10bd785fb6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6500_c86ea72724d19f17c816.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6500_c86ea72724d19f17c816.png
new file mode 100644
index 0000000000000000000000000000000000000000..2b6aac84a53058d7a5b23eceb2046f4ae724ddc8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6500_c86ea72724d19f17c816.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6500_f9740ddff1a04f80e0ea.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6500_f9740ddff1a04f80e0ea.png
new file mode 100644
index 0000000000000000000000000000000000000000..09f92256e10d14e34c1d6cc4cc8ea2f8fc171fde
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6500_f9740ddff1a04f80e0ea.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6700_427bf1070d544d4049da.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6700_427bf1070d544d4049da.png
new file mode 100644
index 0000000000000000000000000000000000000000..db0d90a65d291184fccac034593e2de236f01727
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6700_427bf1070d544d4049da.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6700_a918031bb0a09f2ef390.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6700_a918031bb0a09f2ef390.png
new file mode 100644
index 0000000000000000000000000000000000000000..41dd8b393e8107394967d309a8d61d7200baa0ac
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6700_a918031bb0a09f2ef390.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6700_d0627b3a8a1293fbf958.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6700_d0627b3a8a1293fbf958.png
new file mode 100644
index 0000000000000000000000000000000000000000..e30a3521b0771f797be08364c15dcb06e78cac9c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6700_d0627b3a8a1293fbf958.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6700_e00807fffaed8846bde9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6700_e00807fffaed8846bde9.png
new file mode 100644
index 0000000000000000000000000000000000000000..334f939462600b9ca2286848aa1ca608e87aa0c9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6700_e00807fffaed8846bde9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6900_15a5ca6cebb43a21c745.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6900_15a5ca6cebb43a21c745.png
new file mode 100644
index 0000000000000000000000000000000000000000..144928b9556e40b1ba4f04eff8e8d3b7587b5720
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6900_15a5ca6cebb43a21c745.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6900_17aebe306943c01fefa8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6900_17aebe306943c01fefa8.png
new file mode 100644
index 0000000000000000000000000000000000000000..9150526cf3adbd0afc94b8c9b9f62387130fb60a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6900_17aebe306943c01fefa8.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6900_cdecb623d916f8b469cc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6900_cdecb623d916f8b469cc.png
new file mode 100644
index 0000000000000000000000000000000000000000..b465eb3606ef508e9cce6488a716fd8d1c1cb6ba
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6900_cdecb623d916f8b469cc.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6900_fe9d1f8827b3d0f7dcda.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6900_fe9d1f8827b3d0f7dcda.png
new file mode 100644
index 0000000000000000000000000000000000000000..e6594505ae6f9b092553c5a1a76ef01fa0d00360
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_6900_fe9d1f8827b3d0f7dcda.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_700_2b15a3ee35ff82053584.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_700_2b15a3ee35ff82053584.png
new file mode 100644
index 0000000000000000000000000000000000000000..bc67045f2015d7ddcd7976140e4ab3a1654d49bc
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_700_2b15a3ee35ff82053584.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_700_c40d74e1f54e0de526be.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_700_c40d74e1f54e0de526be.png
new file mode 100644
index 0000000000000000000000000000000000000000..c3599447646087f762222d0f3ca175e6691122c8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_700_c40d74e1f54e0de526be.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_700_da45c898dafd2e44e971.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_700_da45c898dafd2e44e971.png
new file mode 100644
index 0000000000000000000000000000000000000000..c5289cfa8785d509f6bfd42f93650e22fe69e03c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_700_da45c898dafd2e44e971.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_700_e8be4d1d405fed417f2b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_700_e8be4d1d405fed417f2b.png
new file mode 100644
index 0000000000000000000000000000000000000000..b67ef8d290e19d8f1e1c56eeff0c069187f8dfc9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_700_e8be4d1d405fed417f2b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7100_0eb06e0dedffdf7c384a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7100_0eb06e0dedffdf7c384a.png
new file mode 100644
index 0000000000000000000000000000000000000000..06887ee3e2249a9297a7ddb42e190e9ba2f6dee5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7100_0eb06e0dedffdf7c384a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7100_1e8feb063f6a9ec78960.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7100_1e8feb063f6a9ec78960.png
new file mode 100644
index 0000000000000000000000000000000000000000..c278cdd756c78da8e23831ad6d59afbefaa10995
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7100_1e8feb063f6a9ec78960.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7100_5ccd65818f5b544ac9f3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7100_5ccd65818f5b544ac9f3.png
new file mode 100644
index 0000000000000000000000000000000000000000..d80df5a720ca7eb97bdd5824fa71ed4f884a5cc4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7100_5ccd65818f5b544ac9f3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7100_6a43d3b1b1cf2428b09e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7100_6a43d3b1b1cf2428b09e.png
new file mode 100644
index 0000000000000000000000000000000000000000..ba4439e9260321a2d02e0252d0fd5e36315839da
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7100_6a43d3b1b1cf2428b09e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7300_01cdf4dbb914d6f8dd64.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7300_01cdf4dbb914d6f8dd64.png
new file mode 100644
index 0000000000000000000000000000000000000000..931096fc73a005059c0b3bc12dee62f53db6d5ec
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7300_01cdf4dbb914d6f8dd64.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7300_3ac9fc2a95ded732cc7d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7300_3ac9fc2a95ded732cc7d.png
new file mode 100644
index 0000000000000000000000000000000000000000..9f5925046d1120e9b0894b80742e9a7a9065c147
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7300_3ac9fc2a95ded732cc7d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7300_4119d698ca559f516e33.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7300_4119d698ca559f516e33.png
new file mode 100644
index 0000000000000000000000000000000000000000..3484666bca07b30177011f0d57086be6d3f0589f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7300_4119d698ca559f516e33.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4119d698ca559f516e336205bc47c937e807d65bdeae75589ae15f01f67d2f20
+size 113516
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7300_923361193af5e96077e0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7300_923361193af5e96077e0.png
new file mode 100644
index 0000000000000000000000000000000000000000..10e93fdc870afe02f2a1accb9e45d68b7060b8f5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7300_923361193af5e96077e0.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7500_35e0b3b14bcd0b203dc7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7500_35e0b3b14bcd0b203dc7.png
new file mode 100644
index 0000000000000000000000000000000000000000..20118e0f72031389d701d6e02236a042bf99ec25
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7500_35e0b3b14bcd0b203dc7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7500_661dbf65b8f1df7a8b3a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7500_661dbf65b8f1df7a8b3a.png
new file mode 100644
index 0000000000000000000000000000000000000000..b7110f927bf11dfb01d4185d100398db0a3e8e0a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7500_661dbf65b8f1df7a8b3a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7500_90fd13c5432416ab6daf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7500_90fd13c5432416ab6daf.png
new file mode 100644
index 0000000000000000000000000000000000000000..cfde0a5df74abb9886114e617c0f130e9dd8fa6c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7500_90fd13c5432416ab6daf.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7500_c60cda7126d5f78050da.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7500_c60cda7126d5f78050da.png
new file mode 100644
index 0000000000000000000000000000000000000000..7bcfaac2da9821cb9f231e63dd98429e22054b55
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7500_c60cda7126d5f78050da.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7700_3fae4f238409470c3be6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7700_3fae4f238409470c3be6.png
new file mode 100644
index 0000000000000000000000000000000000000000..8e044c1ba4f65ec066533c1e66b5bded950563d3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7700_3fae4f238409470c3be6.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7700_42f157f6bb94a5c778fa.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7700_42f157f6bb94a5c778fa.png
new file mode 100644
index 0000000000000000000000000000000000000000..76425e7e05ea4de1ab7a8c3e679066a289f40d12
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7700_42f157f6bb94a5c778fa.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7700_9e04d6eae1646c7ef4bb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7700_9e04d6eae1646c7ef4bb.png
new file mode 100644
index 0000000000000000000000000000000000000000..432a7be111ce246a5443ef22a249329428df2c3a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7700_9e04d6eae1646c7ef4bb.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7700_f356c0bd24e8604a4246.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7700_f356c0bd24e8604a4246.png
new file mode 100644
index 0000000000000000000000000000000000000000..1bb6d03d515301be94ac77d645ace0e048290149
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7700_f356c0bd24e8604a4246.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7900_0a68e10709f43a8aea48.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7900_0a68e10709f43a8aea48.png
new file mode 100644
index 0000000000000000000000000000000000000000..8ac67c26fab3d053cbb6d6382b782bf673917a8d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7900_0a68e10709f43a8aea48.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7900_10187bb758bdd334007d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7900_10187bb758bdd334007d.png
new file mode 100644
index 0000000000000000000000000000000000000000..426efbb3ac6958b28770d26ab6030f208aae2349
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7900_10187bb758bdd334007d.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7900_1809f4c3cdc8e39f53c9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7900_1809f4c3cdc8e39f53c9.png
new file mode 100644
index 0000000000000000000000000000000000000000..a610c5f867574683cabf47d3b90a64c579268c79
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7900_1809f4c3cdc8e39f53c9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7900_405bf3cc17a4e3fdbeee.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7900_405bf3cc17a4e3fdbeee.png
new file mode 100644
index 0000000000000000000000000000000000000000..47562c5390756d47931e70a9645d020668b1ff73
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_7900_405bf3cc17a4e3fdbeee.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8100_064525ed57bfcb5a29b4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8100_064525ed57bfcb5a29b4.png
new file mode 100644
index 0000000000000000000000000000000000000000..a2790a324e52693ac8c2999c59485497d19354d6
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8100_064525ed57bfcb5a29b4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8100_3960aa3a84140b0fb06e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8100_3960aa3a84140b0fb06e.png
new file mode 100644
index 0000000000000000000000000000000000000000..b126ee6dc3c2d672be92d354e70f42a374519750
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8100_3960aa3a84140b0fb06e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8100_cc3094bb447ae302fe3b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8100_cc3094bb447ae302fe3b.png
new file mode 100644
index 0000000000000000000000000000000000000000..c00a36c7300e17be1a7cd539a4879827b878e2f8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8100_cc3094bb447ae302fe3b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8100_eaddbbad7b32caacb6f4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8100_eaddbbad7b32caacb6f4.png
new file mode 100644
index 0000000000000000000000000000000000000000..6a906ec4c9517f364ef05762427c9fc64530b5b4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8100_eaddbbad7b32caacb6f4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8300_271cd25145d107c70da7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8300_271cd25145d107c70da7.png
new file mode 100644
index 0000000000000000000000000000000000000000..9dbd5c4e74b2d4012dc7faedffdaa8db568ebad9
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8300_271cd25145d107c70da7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8300_4dec11fba27e62859ed9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8300_4dec11fba27e62859ed9.png
new file mode 100644
index 0000000000000000000000000000000000000000..1587126e931c9dbd726d4dc163da0fcd94a91137
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8300_4dec11fba27e62859ed9.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8300_5f054b9380eaf3c3c96c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8300_5f054b9380eaf3c3c96c.png
new file mode 100644
index 0000000000000000000000000000000000000000..72f43ac7448fc64edfd80b47a75f13ab1e5a6f82
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8300_5f054b9380eaf3c3c96c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8300_b864dcad9fde786a5a2e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8300_b864dcad9fde786a5a2e.png
new file mode 100644
index 0000000000000000000000000000000000000000..a46c1099894e4da5bce9355aebb9d83a111b76d4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8300_b864dcad9fde786a5a2e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8500_563c9db4bdf64de447a5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8500_563c9db4bdf64de447a5.png
new file mode 100644
index 0000000000000000000000000000000000000000..4682490fb8c95a542734a438703e62f1cf7af7a8
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8500_563c9db4bdf64de447a5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8500_5fc9a05dd97923b0da63.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8500_5fc9a05dd97923b0da63.png
new file mode 100644
index 0000000000000000000000000000000000000000..01650c9ae42ea516be64d1f33b315860440bb56c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8500_5fc9a05dd97923b0da63.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8500_7e95bd1a37fb3d340e4c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8500_7e95bd1a37fb3d340e4c.png
new file mode 100644
index 0000000000000000000000000000000000000000..f9bd96e072bee67200f32c894d9c4955ea36d9ff
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8500_7e95bd1a37fb3d340e4c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7e95bd1a37fb3d340e4cb4e0999dfbea93dd4122c50d7f43b737840766ec15d0
+size 103894
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8500_c2c5ebfcfe52c0ad52df.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8500_c2c5ebfcfe52c0ad52df.png
new file mode 100644
index 0000000000000000000000000000000000000000..5caadc4f82155dfc9588b536ecd68dac7cf59ce2
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8500_c2c5ebfcfe52c0ad52df.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8700_45970e04afd341d0f0f3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8700_45970e04afd341d0f0f3.png
new file mode 100644
index 0000000000000000000000000000000000000000..b6ffb01bf4e53ca5d65f243b9ad6ebca79b6c3fe
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8700_45970e04afd341d0f0f3.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8700_65b2440bcf69078d48b4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8700_65b2440bcf69078d48b4.png
new file mode 100644
index 0000000000000000000000000000000000000000..3aa4a415400fb3e9fcf58c3d9014c7612fa29f64
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8700_65b2440bcf69078d48b4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8700_e6bdb1cb68ec4a2c5638.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8700_e6bdb1cb68ec4a2c5638.png
new file mode 100644
index 0000000000000000000000000000000000000000..29fcbf26a85208c99a8c44d7938249e32861daaa
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8700_e6bdb1cb68ec4a2c5638.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8700_ec241a973700202c9b87.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8700_ec241a973700202c9b87.png
new file mode 100644
index 0000000000000000000000000000000000000000..f694294d53586ac57a7f9636ddb1cf80d34581ff
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8700_ec241a973700202c9b87.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8900_03961e97b9d29311797e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8900_03961e97b9d29311797e.png
new file mode 100644
index 0000000000000000000000000000000000000000..3199b68e576141d3f94143111f7ba4b4382e5999
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8900_03961e97b9d29311797e.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8900_12b29bf737fe6b7af18b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8900_12b29bf737fe6b7af18b.png
new file mode 100644
index 0000000000000000000000000000000000000000..9ef93537d5d4edae5e318810ad48a451e8e63379
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8900_12b29bf737fe6b7af18b.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8900_b8edfaad164ff4edf7f2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8900_b8edfaad164ff4edf7f2.png
new file mode 100644
index 0000000000000000000000000000000000000000..990ce2c29c7a89b382cbfe672f33e17c50f9d5dc
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8900_b8edfaad164ff4edf7f2.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8900_ca1143b44fad70ab499c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8900_ca1143b44fad70ab499c.png
new file mode 100644
index 0000000000000000000000000000000000000000..2880b19d0ac284d6cab8b72fc2f3d79417605285
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_8900_ca1143b44fad70ab499c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_900_5d78e09242db518ee09f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_900_5d78e09242db518ee09f.png
new file mode 100644
index 0000000000000000000000000000000000000000..9d478c8f0747a6f717473a2993639a7b3353ae19
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_900_5d78e09242db518ee09f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_900_63f799cde54758b83bc4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_900_63f799cde54758b83bc4.png
new file mode 100644
index 0000000000000000000000000000000000000000..39776e131d94fa6fe6525b24f863830f30c67b9a
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_900_63f799cde54758b83bc4.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_900_8e074877f728e0c6b27a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_900_8e074877f728e0c6b27a.png
new file mode 100644
index 0000000000000000000000000000000000000000..3fcbf5f91d3b44def930af6d0447390326cfa721
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_900_8e074877f728e0c6b27a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_900_90aa3c4fbaa5540df2e5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_900_90aa3c4fbaa5540df2e5.png
new file mode 100644
index 0000000000000000000000000000000000000000..f943be031a1246bd1e1442703947edba74b2ba21
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_900_90aa3c4fbaa5540df2e5.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9100_3f01357f9759004fe5cb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9100_3f01357f9759004fe5cb.png
new file mode 100644
index 0000000000000000000000000000000000000000..a389c063b7568060a121c3192151c9cc62cbe4b5
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9100_3f01357f9759004fe5cb.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9100_8dd9c5ff1e2caeed7b29.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9100_8dd9c5ff1e2caeed7b29.png
new file mode 100644
index 0000000000000000000000000000000000000000..8a257da739ab84b488450bd258fb12625ad70971
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9100_8dd9c5ff1e2caeed7b29.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9100_8e98e0e82c6aecbdec2a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9100_8e98e0e82c6aecbdec2a.png
new file mode 100644
index 0000000000000000000000000000000000000000..98847379c04dc8b1671ee727e34b35610d83f8b2
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9100_8e98e0e82c6aecbdec2a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9100_e5262475ae1b8383e131.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9100_e5262475ae1b8383e131.png
new file mode 100644
index 0000000000000000000000000000000000000000..af30020fd05f5767dc60279df542638f32bf1d39
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9100_e5262475ae1b8383e131.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9300_40e054a1f62067987b8f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9300_40e054a1f62067987b8f.png
new file mode 100644
index 0000000000000000000000000000000000000000..e02c12d41be826ba2ac98d0e3a6b09b3510c8ad4
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9300_40e054a1f62067987b8f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9300_6662eee5e0006b127f8a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9300_6662eee5e0006b127f8a.png
new file mode 100644
index 0000000000000000000000000000000000000000..70969d595127e884e51703b71154a0bc7b6e5e3e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9300_6662eee5e0006b127f8a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9300_ef44b946cf04efec4eee.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9300_ef44b946cf04efec4eee.png
new file mode 100644
index 0000000000000000000000000000000000000000..032a98e548e3242601238ce41758232f09c14ea1
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9300_ef44b946cf04efec4eee.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9300_fd6e3fe3c28702ce1487.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9300_fd6e3fe3c28702ce1487.png
new file mode 100644
index 0000000000000000000000000000000000000000..42b8fdca3e4c7eb505d03847073c83f3fae12f64
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9300_fd6e3fe3c28702ce1487.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9500_1bbe0093233889a1949c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9500_1bbe0093233889a1949c.png
new file mode 100644
index 0000000000000000000000000000000000000000..1ac22ac3dcf45c80f8baf1265ac49651459c090c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9500_1bbe0093233889a1949c.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9500_3117170128a197034f84.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9500_3117170128a197034f84.png
new file mode 100644
index 0000000000000000000000000000000000000000..2d56256e49d6b027140a61ce703d781fb3511f0d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9500_3117170128a197034f84.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9500_7fad6cfaa3adecbff870.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9500_7fad6cfaa3adecbff870.png
new file mode 100644
index 0000000000000000000000000000000000000000..c24b0f91080c562dc479ffd58e0fca578c752275
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9500_7fad6cfaa3adecbff870.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9500_a02850ad11ddd5d209ab.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9500_a02850ad11ddd5d209ab.png
new file mode 100644
index 0000000000000000000000000000000000000000..352bf417d1e303a8e4cba55f74fb46cbca65ab96
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9500_a02850ad11ddd5d209ab.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9700_7dae665a31341e8b7d40.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9700_7dae665a31341e8b7d40.png
new file mode 100644
index 0000000000000000000000000000000000000000..c6569a86055bb2fbf28f6e66a47af8aab1f705c2
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9700_7dae665a31341e8b7d40.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9700_aad34b2467f1a48577ac.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9700_aad34b2467f1a48577ac.png
new file mode 100644
index 0000000000000000000000000000000000000000..52e58a584e66a787f64ae934a8fdab8ce0507e65
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9700_aad34b2467f1a48577ac.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9700_eb351daf3e1d585ad65f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9700_eb351daf3e1d585ad65f.png
new file mode 100644
index 0000000000000000000000000000000000000000..d4c88ae3616bc60b4853b0d7efd1e1ad21076774
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9700_eb351daf3e1d585ad65f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9700_efd3a0c94f77b86bec1a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9700_efd3a0c94f77b86bec1a.png
new file mode 100644
index 0000000000000000000000000000000000000000..310a0d4f1c35fa1f563f66053ec58e23aca67074
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9700_efd3a0c94f77b86bec1a.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9900_07e54dcc7777d9495723.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9900_07e54dcc7777d9495723.png
new file mode 100644
index 0000000000000000000000000000000000000000..4bd08bf1d6bf84451b6f43c51cdf351965e891c3
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9900_07e54dcc7777d9495723.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9900_7a7f9ef929bed578a064.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9900_7a7f9ef929bed578a064.png
new file mode 100644
index 0000000000000000000000000000000000000000..c53ca13e1edc71b2f4642d0faa86afd9d1bf697e
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9900_7a7f9ef929bed578a064.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9900_c36085baef021cf25dbe.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9900_c36085baef021cf25dbe.png
new file mode 100644
index 0000000000000000000000000000000000000000..9603da4d9ae2de3f62a1f5ec24f7f8019d17fa05
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9900_c36085baef021cf25dbe.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9900_c56afe5370013ad64f71.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9900_c56afe5370013ad64f71.png
new file mode 100644
index 0000000000000000000000000000000000000000..f406c1a5ed19866124fb726cfe5ce112daf5e69c
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_first_frame_9900_c56afe5370013ad64f71.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10100_093b1a5034c6b9b3dbb7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10100_093b1a5034c6b9b3dbb7.png
new file mode 100644
index 0000000000000000000000000000000000000000..9e48d81f3bc2acbcfe0a3b82a9fc5b34c08bb237
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10100_093b1a5034c6b9b3dbb7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:093b1a5034c6b9b3dbb78ff85ae72cb408884b733582abe18063844d1cbfaa03
+size 588420
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10100_ae921b2ea86bada035c7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10100_ae921b2ea86bada035c7.png
new file mode 100644
index 0000000000000000000000000000000000000000..ddd49ef626ac90638bd2fb75c5038b29c0e5bb43
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10100_ae921b2ea86bada035c7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ae921b2ea86bada035c762e760529f2dde282ba9e3351401c597a805029f2964
+size 557769
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10100_af3c8b7311a670553ae9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10100_af3c8b7311a670553ae9.png
new file mode 100644
index 0000000000000000000000000000000000000000..78582a9775f0e2050f1191b85f9fa0cd4a2e67a9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10100_af3c8b7311a670553ae9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:af3c8b7311a670553ae96032b14d5d326aa6a18718fbb324b5263a9bed909473
+size 774328
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10100_e0bcf1cfa29f67e8afc7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10100_e0bcf1cfa29f67e8afc7.png
new file mode 100644
index 0000000000000000000000000000000000000000..8323db9d9e07c775486141db6ec4c2b8e9ebe52e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10100_e0bcf1cfa29f67e8afc7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e0bcf1cfa29f67e8afc75a821a6f2c290af3cdf71ec2b4fb6a39fcb5e18caace
+size 344559
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10300_0176b14c37bba7f7e696.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10300_0176b14c37bba7f7e696.png
new file mode 100644
index 0000000000000000000000000000000000000000..4e6bcd8e519f08a37f64a3c1b7e96f73f2d05b5d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10300_0176b14c37bba7f7e696.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0176b14c37bba7f7e69649596b01fdd3372c18d4028a2331b99d0c2e59e1d0af
+size 281010
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10300_5fcdf80cc9a796b3f7e8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10300_5fcdf80cc9a796b3f7e8.png
new file mode 100644
index 0000000000000000000000000000000000000000..88de492d29206725a66e6d1544701361a864b6ea
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10300_5fcdf80cc9a796b3f7e8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5fcdf80cc9a796b3f7e8735400da7114cdc4e061ee1990c6252ea8dbac161e10
+size 701092
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10300_aaed675112c3ab1ac971.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10300_aaed675112c3ab1ac971.png
new file mode 100644
index 0000000000000000000000000000000000000000..58b080b902d72ef5b10525d7871f5c36e24f2c5d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10300_aaed675112c3ab1ac971.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aaed675112c3ab1ac971e88298bdacaaf4c84b55ebb8b4a5c7138f2b0da20dca
+size 648286
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10300_da3ab3e58086a32e0b73.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10300_da3ab3e58086a32e0b73.png
new file mode 100644
index 0000000000000000000000000000000000000000..412da92a87234ba77407107614f30f5a5a97cf59
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10300_da3ab3e58086a32e0b73.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:da3ab3e58086a32e0b7332904c83d2ef54a4b8483eaa000e0a2367cc25779c2b
+size 473164
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10500_2943649aae1ca4575db6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10500_2943649aae1ca4575db6.png
new file mode 100644
index 0000000000000000000000000000000000000000..f8eca14cd9d32d848a13333640e8f0184758d8ef
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10500_2943649aae1ca4575db6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2943649aae1ca4575db63acbc44b433367ff80c4a79d4e06409550a84ab5bd97
+size 1031398
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10500_3b89c012fcc8524e9786.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10500_3b89c012fcc8524e9786.png
new file mode 100644
index 0000000000000000000000000000000000000000..5008248fb1cc146c8680c435e91499bd8fa15335
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10500_3b89c012fcc8524e9786.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3b89c012fcc8524e9786306ff0bd5803b9bc4347c4e96c9bbbe36b645dd40d1f
+size 317654
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10500_8ff5a65f0ada39efd758.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10500_8ff5a65f0ada39efd758.png
new file mode 100644
index 0000000000000000000000000000000000000000..7d597487e06312d2638a1ea46fd9a9feabe14ca8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10500_8ff5a65f0ada39efd758.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8ff5a65f0ada39efd758d2775a00f7cd2e8a180daa1473e4013e40ad36c28802
+size 1000914
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10500_98ae64c7abea19354d57.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10500_98ae64c7abea19354d57.png
new file mode 100644
index 0000000000000000000000000000000000000000..8a57eaac76c9aa9383c21eca6ed7d00234a3e920
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10500_98ae64c7abea19354d57.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:98ae64c7abea19354d57705c8c0d10858859455c6309ffcb1e3fb229a22eae24
+size 671216
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10700_3f7471f50bebe39d3849.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10700_3f7471f50bebe39d3849.png
new file mode 100644
index 0000000000000000000000000000000000000000..294b4cddb8bc8e477894bcca6ad1a950f9f8bd68
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10700_3f7471f50bebe39d3849.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3f7471f50bebe39d3849f85467f5e5d33ca571b54fada83c78196dd5473b745c
+size 534235
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10700_49d1d68d8000d3d19088.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10700_49d1d68d8000d3d19088.png
new file mode 100644
index 0000000000000000000000000000000000000000..8e36fe9a5cafa0c5b41666b1f10c4770a27274a3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10700_49d1d68d8000d3d19088.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:49d1d68d8000d3d190881a875e9666202ee714f9122add6842bf796a6f16ec6c
+size 630673
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10700_6968f7398fa007d49a84.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10700_6968f7398fa007d49a84.png
new file mode 100644
index 0000000000000000000000000000000000000000..e7319bdf1386d2a398a21206db44a59e3d44ac76
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10700_6968f7398fa007d49a84.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6968f7398fa007d49a847c15e8347032f488ea0141496aae114231a22722ec6b
+size 791944
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10700_8b806d8174a28e24cf30.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10700_8b806d8174a28e24cf30.png
new file mode 100644
index 0000000000000000000000000000000000000000..219537904081bf71f51983220f0ed10085f9389d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10700_8b806d8174a28e24cf30.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8b806d8174a28e24cf3097c1d0837f6261e403c8f1b135da954b060be382230a
+size 1070680
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10900_06fc65f430d4b794da3e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10900_06fc65f430d4b794da3e.png
new file mode 100644
index 0000000000000000000000000000000000000000..ecc190ab3f780667c4966ceaf93c7b403b3cffa3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10900_06fc65f430d4b794da3e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:06fc65f430d4b794da3e2fa50c9a9439e7d1448ad086c7274b833bfb83c8fde2
+size 548443
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10900_144684d76b8c2613cd75.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10900_144684d76b8c2613cd75.png
new file mode 100644
index 0000000000000000000000000000000000000000..4925336f0e38292b55ee98200845eec44363a97d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10900_144684d76b8c2613cd75.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:144684d76b8c2613cd75094a470a4853c98b7e2fe542e4d7292ace088a0da82b
+size 917905
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10900_9539646c7e8f161ecd9c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10900_9539646c7e8f161ecd9c.png
new file mode 100644
index 0000000000000000000000000000000000000000..ad728435b1ea4a2be6e768b074915427e1f1800b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10900_9539646c7e8f161ecd9c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9539646c7e8f161ecd9cccd77f3692f655bc0941e51515b0b88feb30729f78b8
+size 1190202
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10900_ea14283eed9f12ecc701.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10900_ea14283eed9f12ecc701.png
new file mode 100644
index 0000000000000000000000000000000000000000..9509dd8df1638e6f56ac12828de8785aa73beaa4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_10900_ea14283eed9f12ecc701.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ea14283eed9f12ecc70195bd1ec71bc196decbf5aebcaf7c1c58b3073f699ac7
+size 342673
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1100_03354937601addd5d3bc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1100_03354937601addd5d3bc.png
new file mode 100644
index 0000000000000000000000000000000000000000..8bb79fb845960375bf8550513eef6fc1af3c906c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1100_03354937601addd5d3bc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:03354937601addd5d3bc500f2b16d2ac3b1b7cffd9a3447234c5f858e9403811
+size 583432
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1100_20bb46d0eb49fffc3294.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1100_20bb46d0eb49fffc3294.png
new file mode 100644
index 0000000000000000000000000000000000000000..004f0e825751d4663dba5ba0db730c8b71482e01
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1100_20bb46d0eb49fffc3294.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:20bb46d0eb49fffc3294270e2054b789972529cbd85dcc1c1274ea98972478e8
+size 511056
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1100_bea7cffd68e81ede167b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1100_bea7cffd68e81ede167b.png
new file mode 100644
index 0000000000000000000000000000000000000000..1309635f0418f0cb95e87a0d9af2fae852c036ee
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1100_bea7cffd68e81ede167b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bea7cffd68e81ede167bc5b3b039567ac691fb64ca6305b29ebbfecd5d22aea6
+size 601944
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1100_f66e3e139144497726fb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1100_f66e3e139144497726fb.png
new file mode 100644
index 0000000000000000000000000000000000000000..e64a2b53d4bc93a6963121db041e8aa4a4a1a3ac
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1100_f66e3e139144497726fb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f66e3e139144497726fb837c5fab3e24e4b463a9175f6082c2711de16ec2234c
+size 436348
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11100_ab03c5a0b805322f751c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11100_ab03c5a0b805322f751c.png
new file mode 100644
index 0000000000000000000000000000000000000000..75f9777e8ccdc410f71a46487ed2fa8837971ab5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11100_ab03c5a0b805322f751c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ab03c5a0b805322f751cf33ae9322bd2aafd4e79733fcb5c7d6fa0c22e119d0b
+size 783937
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11100_b68785f922d7c0472006.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11100_b68785f922d7c0472006.png
new file mode 100644
index 0000000000000000000000000000000000000000..3afa89129e98b3e0f6705a6b2ff06205364000f9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11100_b68785f922d7c0472006.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b68785f922d7c0472006a21ec6e9ac9d54617b73c1dafca514c5a620c7a4c991
+size 359667
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11100_be552891b534d37d5e5d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11100_be552891b534d37d5e5d.png
new file mode 100644
index 0000000000000000000000000000000000000000..6408b9a38747b16e9a0bb663897ba01b652319a9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11100_be552891b534d37d5e5d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:be552891b534d37d5e5d70b2331e125fc79fa9776fdcc8498810dce247f9c01d
+size 393674
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11100_dcc288937ba5a65c34b0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11100_dcc288937ba5a65c34b0.png
new file mode 100644
index 0000000000000000000000000000000000000000..284be50256e228aaa4cdd861df7107e768a6268c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11100_dcc288937ba5a65c34b0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dcc288937ba5a65c34b02fabc12b97e743a6c42b95462b2db6d4c25a53f36607
+size 819488
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11300_0ad8063efc6fccc4701a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11300_0ad8063efc6fccc4701a.png
new file mode 100644
index 0000000000000000000000000000000000000000..6b9732600561138a65ef8ff10e3c61c2a5421194
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11300_0ad8063efc6fccc4701a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0ad8063efc6fccc4701aad87ad6eee933d3aa309fe9cbb57800ce4091952ced9
+size 487038
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11300_30524c9e6cd6320445e5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11300_30524c9e6cd6320445e5.png
new file mode 100644
index 0000000000000000000000000000000000000000..d460f8dbc65b712195920543b2b585e2db507424
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11300_30524c9e6cd6320445e5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:30524c9e6cd6320445e5b6c36771058994846ae187def0dc9a3af5b669bbdb27
+size 609184
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11300_880ba89fa2c48634c1e9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11300_880ba89fa2c48634c1e9.png
new file mode 100644
index 0000000000000000000000000000000000000000..6e6344d807e0d442765511a87a02f28e895f59cc
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11300_880ba89fa2c48634c1e9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:880ba89fa2c48634c1e9ac6bea6245cc4eddcb9670c8d56b780f899cb5309d04
+size 859238
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11300_eb1bbff23fb8e4d05eaf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11300_eb1bbff23fb8e4d05eaf.png
new file mode 100644
index 0000000000000000000000000000000000000000..e18114c1822643fbb5840f548f9c82732aefb550
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11300_eb1bbff23fb8e4d05eaf.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:eb1bbff23fb8e4d05eaf9917d3b2fa31bb723ca31464e1f334e57b6b27f8a936
+size 273246
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11500_3ea6d4f542534775ba48.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11500_3ea6d4f542534775ba48.png
new file mode 100644
index 0000000000000000000000000000000000000000..911b61e4de6e1b179266193ae6e9798bf978c502
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11500_3ea6d4f542534775ba48.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3ea6d4f542534775ba48c7d5f0ce22ff84df5992b1f193c374b8a23ab2f9774f
+size 440480
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11500_4f1b0e343ab85d458a36.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11500_4f1b0e343ab85d458a36.png
new file mode 100644
index 0000000000000000000000000000000000000000..fc44e649fa539aa93738c964c45cf0276b36c71f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11500_4f1b0e343ab85d458a36.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4f1b0e343ab85d458a365b222fef0389e26d760dcaaaa0ec2d8fe88ce0ce307c
+size 661614
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11500_84d47ba7ddd6a03b2eb9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11500_84d47ba7ddd6a03b2eb9.png
new file mode 100644
index 0000000000000000000000000000000000000000..272aa7ffb9b45c966f182e94289ecbeee1ac8ce7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11500_84d47ba7ddd6a03b2eb9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:84d47ba7ddd6a03b2eb91b287e68690ef9f2b890ef6de6bc304c5908e6756f85
+size 671252
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11500_eca29a91acbfd09cba85.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11500_eca29a91acbfd09cba85.png
new file mode 100644
index 0000000000000000000000000000000000000000..ff407ec8033dde589ede28fe2723b9ded183f838
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11500_eca29a91acbfd09cba85.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:eca29a91acbfd09cba8503f41f3f89bd5e0421b5527ca3703dc77d073ceee386
+size 244641
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11700_06b39451354bf2aebadc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11700_06b39451354bf2aebadc.png
new file mode 100644
index 0000000000000000000000000000000000000000..69cdc7c26463aefb9b8306d0d4d2e01d46b9785a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11700_06b39451354bf2aebadc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:06b39451354bf2aebadc53468a6525caf5e73b272af4fc15a3ba9b4d2670a7d3
+size 467913
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11700_2923ed76c4bca699d99a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11700_2923ed76c4bca699d99a.png
new file mode 100644
index 0000000000000000000000000000000000000000..2e62d8f0d91f0547a12a4d64a51f239d075bf45f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11700_2923ed76c4bca699d99a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2923ed76c4bca699d99a693af37477469acdabf4231f392dde6df1bc10f1c467
+size 733394
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11700_63b86b278def56f43888.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11700_63b86b278def56f43888.png
new file mode 100644
index 0000000000000000000000000000000000000000..ed71a9650a391088b24c916becb649db5fc04f6e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11700_63b86b278def56f43888.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:63b86b278def56f43888ecbd1750ba4030b07891ed3bf7b33f9daa0a3bf035f8
+size 599926
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11700_d024a86d62162052de47.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11700_d024a86d62162052de47.png
new file mode 100644
index 0000000000000000000000000000000000000000..13c81f9a50af7fc053da15e2f7ce99d2f9818fb2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11700_d024a86d62162052de47.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d024a86d62162052de4770b8bdd4099ca92ea1032cd30536d137a942c11dec70
+size 1548044
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11900_307b08346d4593934eb7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11900_307b08346d4593934eb7.png
new file mode 100644
index 0000000000000000000000000000000000000000..aa4dd05b2b8f34caaa1c21edb0c767ae3ba01bb5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11900_307b08346d4593934eb7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:307b08346d4593934eb78e19ab0f8d2647791e4f8a81b84a3855b1f9f60fb3c4
+size 461487
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11900_b60f378b4b9cbb0e6c71.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11900_b60f378b4b9cbb0e6c71.png
new file mode 100644
index 0000000000000000000000000000000000000000..f8f153cab8489fa80f0d11785a9077175cbfa694
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11900_b60f378b4b9cbb0e6c71.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b60f378b4b9cbb0e6c71768622bbe9ac4ca0f58701ffd9950131b5728d69cd9c
+size 869555
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11900_c28cfd85edbc90a42d17.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11900_c28cfd85edbc90a42d17.png
new file mode 100644
index 0000000000000000000000000000000000000000..e440ae7cca7952c728898da6977fd7fd6f6ccc9e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11900_c28cfd85edbc90a42d17.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c28cfd85edbc90a42d172130007d893a815ca2485ea37a3fe413298da7b473c5
+size 1104703
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11900_c5743123640b1329d363.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11900_c5743123640b1329d363.png
new file mode 100644
index 0000000000000000000000000000000000000000..946cd22d76c5338b9ae08e71ec16f33a92411db8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_11900_c5743123640b1329d363.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c5743123640b1329d3639b08cd38e0827429d38e540949e76c613723f6cc46cb
+size 521223
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12100_14d6d38dcfb237efb2ca.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12100_14d6d38dcfb237efb2ca.png
new file mode 100644
index 0000000000000000000000000000000000000000..b37aca3ae2cb4b84318b629965ef41e86b3b69ea
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12100_14d6d38dcfb237efb2ca.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:14d6d38dcfb237efb2ca2f042329fc8799bca843116854e115faa84d3cc38839
+size 1286383
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12100_687a1c8450e0c367af26.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12100_687a1c8450e0c367af26.png
new file mode 100644
index 0000000000000000000000000000000000000000..3e90b58441ccc0335418589cc7f48ece0048ddc6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12100_687a1c8450e0c367af26.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:687a1c8450e0c367af2621daaec23168f8c11f8407cb15b2065ef4074b3ce50a
+size 854531
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12100_898f524c9b22bb60d205.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12100_898f524c9b22bb60d205.png
new file mode 100644
index 0000000000000000000000000000000000000000..84260ec8b625aceb40ba69749f0a6e0b08301cd9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12100_898f524c9b22bb60d205.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:898f524c9b22bb60d20571873e3d4cfc9e8b93dea8c89e7349d46325e33f1207
+size 433234
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12100_d147a4f32cfa3809c3bf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12100_d147a4f32cfa3809c3bf.png
new file mode 100644
index 0000000000000000000000000000000000000000..a6b8183ccb80cb26b6e23fda99eae005dd66dfbd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12100_d147a4f32cfa3809c3bf.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d147a4f32cfa3809c3bf45932dc51921f84ac46fbeb64d8161ccaa7776b73aa8
+size 1161700
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12300_282f69082c3d2c036e4e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12300_282f69082c3d2c036e4e.png
new file mode 100644
index 0000000000000000000000000000000000000000..ca13a93fced2ef8ca0825663edbc6a2a167c1b84
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12300_282f69082c3d2c036e4e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:282f69082c3d2c036e4ef5d4b31eb4f35ab0e8487b5d294a950cef91346c671a
+size 357169
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12300_2ca1e2ed2af722e6ef44.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12300_2ca1e2ed2af722e6ef44.png
new file mode 100644
index 0000000000000000000000000000000000000000..18da13802ea370dc034c9a15287e9dd3b62de205
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12300_2ca1e2ed2af722e6ef44.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2ca1e2ed2af722e6ef44dcb0e215c77b398e51eda601ca673b074578a6fe2f6b
+size 363486
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12300_6b4b6c8c196d1ebb9205.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12300_6b4b6c8c196d1ebb9205.png
new file mode 100644
index 0000000000000000000000000000000000000000..8ca6a76db9569a48ab467c13b1407cf88ce65691
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12300_6b4b6c8c196d1ebb9205.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6b4b6c8c196d1ebb920596691a409557daae275e8f853f3ddea5beef4c659f81
+size 854347
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12300_9bb7d4e4ff57e305d720.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12300_9bb7d4e4ff57e305d720.png
new file mode 100644
index 0000000000000000000000000000000000000000..b4ef296f991e1c5ab0a78b77b9518e488f9b507d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12300_9bb7d4e4ff57e305d720.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9bb7d4e4ff57e305d720b615d71934329876c799a1c3270d09dde2b4344b8eb4
+size 983676
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12500_7421f46e125e7b9f9546.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12500_7421f46e125e7b9f9546.png
new file mode 100644
index 0000000000000000000000000000000000000000..1d2ce3b09faaf98d9bb3973c0a66c71b4503368d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12500_7421f46e125e7b9f9546.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7421f46e125e7b9f95464c457f490f11a8ca3ad379cb866c5f21d9bcdee1dd04
+size 427289
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12500_8044c831a6cadcb7ce14.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12500_8044c831a6cadcb7ce14.png
new file mode 100644
index 0000000000000000000000000000000000000000..3e37ec4c077147f0170aca8a1801bef0dc869aea
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12500_8044c831a6cadcb7ce14.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8044c831a6cadcb7ce1405ce67e56dea24b5ec60728fcdbd9fe6e9c580cf0b00
+size 553689
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12500_8b38bd268d19401454e8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12500_8b38bd268d19401454e8.png
new file mode 100644
index 0000000000000000000000000000000000000000..992fb8431de2933d0d7539b5487a7150e543497e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12500_8b38bd268d19401454e8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8b38bd268d19401454e8c070c07aeb66f81406edfa03c34ff03bbd170ad6343c
+size 837507
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12500_ccbb7e69338688c29447.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12500_ccbb7e69338688c29447.png
new file mode 100644
index 0000000000000000000000000000000000000000..4ec6fe6b960f7bcc9224081c3bddfae40c715d95
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12500_ccbb7e69338688c29447.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ccbb7e69338688c29447a84fadfdbf6a664bccf0aa7661f216dcd57eb1a2b903
+size 1200513
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12700_1a5257e0f9d97c7a75cb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12700_1a5257e0f9d97c7a75cb.png
new file mode 100644
index 0000000000000000000000000000000000000000..aa0e6289e41264cd52f8903ad0fafdcf8a9e858b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12700_1a5257e0f9d97c7a75cb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1a5257e0f9d97c7a75cb7b70d87329594e175445a276c4cb0dc7fde7bf5367dd
+size 753069
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12700_237daa18b25d7bc829ef.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12700_237daa18b25d7bc829ef.png
new file mode 100644
index 0000000000000000000000000000000000000000..c2692a8d777b51e6480dba22f740cd4e0750a57f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12700_237daa18b25d7bc829ef.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:237daa18b25d7bc829efa7ab2561898dbd349dd5c83ebe4fd0daa0e6c619f24a
+size 277217
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12700_2b7afeecf8867195a22e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12700_2b7afeecf8867195a22e.png
new file mode 100644
index 0000000000000000000000000000000000000000..86268868e03cbcf04302b6dbef53fcdcb82d0183
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12700_2b7afeecf8867195a22e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2b7afeecf8867195a22eecdce9084dfc4228ac864703a3455be75abc1f4c42bb
+size 1309585
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12700_4719bd5ecdafe0583f15.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12700_4719bd5ecdafe0583f15.png
new file mode 100644
index 0000000000000000000000000000000000000000..e044dc4336aa06abbdfefb4371e028dec31375bc
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12700_4719bd5ecdafe0583f15.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4719bd5ecdafe0583f158edb7d11b33b891937992462688111244bc5c440c3c3
+size 364570
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12900_10bc3802def488b9bb5a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12900_10bc3802def488b9bb5a.png
new file mode 100644
index 0000000000000000000000000000000000000000..6f14e61cf0f71e4596d65dbd180d29174fef9322
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12900_10bc3802def488b9bb5a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:10bc3802def488b9bb5ad498776a46ea6f75e02362058ff5f2446035aea8e4e9
+size 576279
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12900_5db0a86566899655c305.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12900_5db0a86566899655c305.png
new file mode 100644
index 0000000000000000000000000000000000000000..058e776dd364d6967165c9fad9e46db142a472aa
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12900_5db0a86566899655c305.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5db0a86566899655c30537b1886c07fb627dcb5ea1bcb1f06e5427218eccdba5
+size 1581879
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12900_62af1519452152eaafc0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12900_62af1519452152eaafc0.png
new file mode 100644
index 0000000000000000000000000000000000000000..49d5a05ef54441f92e0efa4bf090bf38e0408e89
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12900_62af1519452152eaafc0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:62af1519452152eaafc0f73612a3f142e79f6b33039ad4be891181567c9791a9
+size 529124
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12900_86d28022091319aa8ee4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12900_86d28022091319aa8ee4.png
new file mode 100644
index 0000000000000000000000000000000000000000..3b1aafaad57c1eb023ffab48ab6e35bfe3f856c2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_12900_86d28022091319aa8ee4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:86d28022091319aa8ee489008a4e511bfc8bb7e9f2e3395982d58c869646be3e
+size 878532
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1300_390b5bb63cc1c3a4a1fb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1300_390b5bb63cc1c3a4a1fb.png
new file mode 100644
index 0000000000000000000000000000000000000000..2419b98e1ff198c22833c4499958dfe64c0e7cfe
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1300_390b5bb63cc1c3a4a1fb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:390b5bb63cc1c3a4a1fbf7ae7e6a75bc3858600470262778f5a558c3d88bf027
+size 138974
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1300_48dcd8560bf6d4012b1a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1300_48dcd8560bf6d4012b1a.png
new file mode 100644
index 0000000000000000000000000000000000000000..c2f0416777703929eda5a36959af3b0d0c899823
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1300_48dcd8560bf6d4012b1a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:48dcd8560bf6d4012b1afec16b6c2d90cd9c1a871763a6dbaf416c5090ea1ff0
+size 123603
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1300_97674b3c12d23c75727b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1300_97674b3c12d23c75727b.png
new file mode 100644
index 0000000000000000000000000000000000000000..f90a23c5fc260517fac01b934d35d9a3f0d679d8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1300_97674b3c12d23c75727b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:97674b3c12d23c75727bab66517beb3bf491d01df1cb846e43e26c461f844374
+size 134157
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1300_e4ea1f7730f3f017bb4d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1300_e4ea1f7730f3f017bb4d.png
new file mode 100644
index 0000000000000000000000000000000000000000..1b53d0894b85370397df9a0f3e84558d85bf1d73
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1300_e4ea1f7730f3f017bb4d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e4ea1f7730f3f017bb4d35fc0f1197935f46abae0ad130e9786843c3851b7579
+size 138957
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13100_116e20c867494779b3ac.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13100_116e20c867494779b3ac.png
new file mode 100644
index 0000000000000000000000000000000000000000..b88ca63e114cb617070516ef60d7e5143f341234
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13100_116e20c867494779b3ac.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:116e20c867494779b3ace55dd5253f33241af656f93328aac0fdd93f99ab873a
+size 351824
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13100_4b5b9ae001b0fac6e3ca.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13100_4b5b9ae001b0fac6e3ca.png
new file mode 100644
index 0000000000000000000000000000000000000000..df41fc9fd7d4ab7550635666940ce7384f64a0bd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13100_4b5b9ae001b0fac6e3ca.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4b5b9ae001b0fac6e3ca735d0a2bce50a2118f22f23d3a1a8ee8a2622e6bb2ed
+size 543658
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13100_71f4057d6278ab8ffaf5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13100_71f4057d6278ab8ffaf5.png
new file mode 100644
index 0000000000000000000000000000000000000000..774e1fd981ca86cbdf990cf90b7fed71c51fe724
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13100_71f4057d6278ab8ffaf5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:71f4057d6278ab8ffaf500ea8ad37c1d35fd75452f7a302722737104993c35fd
+size 1219244
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13100_8c7a965b73f7803129bc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13100_8c7a965b73f7803129bc.png
new file mode 100644
index 0000000000000000000000000000000000000000..ddfb5bb8ea4b108909cfcc443da50b1064e33879
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13100_8c7a965b73f7803129bc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8c7a965b73f7803129bc9ab24e0f1124d87ee2764876da35966c0971da0d724c
+size 427914
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13300_2c193b63cf45d0987add.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13300_2c193b63cf45d0987add.png
new file mode 100644
index 0000000000000000000000000000000000000000..28661fa7571e0c4650032e9de365f70e3027c2be
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13300_2c193b63cf45d0987add.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2c193b63cf45d0987add92c2c949658415b45498d3040809ad78e55b7cf394b1
+size 1365851
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13300_3c6ac0c83665e6a74cab.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13300_3c6ac0c83665e6a74cab.png
new file mode 100644
index 0000000000000000000000000000000000000000..3f014904d82799d309a25a174ac3f774b6cee53a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13300_3c6ac0c83665e6a74cab.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3c6ac0c83665e6a74cab5ac921859e2dee5a9e5c7f3f033df3f28af84ec65974
+size 328059
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13300_4499ad257b2df27a3704.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13300_4499ad257b2df27a3704.png
new file mode 100644
index 0000000000000000000000000000000000000000..b80f55fb983a5bea1f00af2a7e61c58de5282b5d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13300_4499ad257b2df27a3704.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4499ad257b2df27a3704816872d8dc02b3d376d498dfe599024c115d23ef9049
+size 880253
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13300_55d0faa28416085bcf29.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13300_55d0faa28416085bcf29.png
new file mode 100644
index 0000000000000000000000000000000000000000..0f68ec681cc6dccd397296c348091b627d57877a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13300_55d0faa28416085bcf29.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:55d0faa28416085bcf2930e635e7507db060cf20fecfaf19313efe0ca8878951
+size 724575
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13500_0d463fe8d2bc1aed5c02.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13500_0d463fe8d2bc1aed5c02.png
new file mode 100644
index 0000000000000000000000000000000000000000..dad5ac4cfbdf5a0ef7acc9bdb42932f83d35f6b5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13500_0d463fe8d2bc1aed5c02.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0d463fe8d2bc1aed5c020575a4ff48af25cf956bf172acd4f829228ebe8ee447
+size 992676
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13500_ae7db045782fc328a076.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13500_ae7db045782fc328a076.png
new file mode 100644
index 0000000000000000000000000000000000000000..e4d0620fd650bf4689f4a37c64da44a2525f0534
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13500_ae7db045782fc328a076.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ae7db045782fc328a07694a3764fe238f29f638702c4a686978b297ad2a2e21b
+size 501048
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13500_c60aae53101e7233d2a3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13500_c60aae53101e7233d2a3.png
new file mode 100644
index 0000000000000000000000000000000000000000..67c04d36d79e7059daec7cff3e307ffa57e50958
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13500_c60aae53101e7233d2a3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c60aae53101e7233d2a3ea22aeb385fd4ceb4ddc36f0a21135ecd147d7e149ba
+size 540246
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13500_c9ec546296d200b78c99.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13500_c9ec546296d200b78c99.png
new file mode 100644
index 0000000000000000000000000000000000000000..8e4154c36f0548645b741c82d8f4698821674f2e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13500_c9ec546296d200b78c99.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c9ec546296d200b78c993192451eb999d8f60731361d2c0dfdddafb3267ee919
+size 759764
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13700_188a4e003796d67365f9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13700_188a4e003796d67365f9.png
new file mode 100644
index 0000000000000000000000000000000000000000..61e0c6a151c5dc52e2729e262a15623d6d320612
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13700_188a4e003796d67365f9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:188a4e003796d67365f99f3954d032ea9237dc545779187c2e1a171e4cecae37
+size 438464
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13700_33a1e395cd5e974ef1f2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13700_33a1e395cd5e974ef1f2.png
new file mode 100644
index 0000000000000000000000000000000000000000..6da8539d03e7d1e734d2c7931b59aadb1b27e70b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13700_33a1e395cd5e974ef1f2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:33a1e395cd5e974ef1f2d716a91c516b496842374acf2a9db9fc07cee9d26417
+size 794151
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13700_8c8ea07ef78f6b1a789c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13700_8c8ea07ef78f6b1a789c.png
new file mode 100644
index 0000000000000000000000000000000000000000..4ffc7f397f672ae2c5368ec70c3afe5e6edc0da6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13700_8c8ea07ef78f6b1a789c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8c8ea07ef78f6b1a789c5710988a421c3be32105e3d075632037f65a88ffeb00
+size 333385
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13700_a24093849aa388844712.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13700_a24093849aa388844712.png
new file mode 100644
index 0000000000000000000000000000000000000000..8dfe0dae75870913f25d215c42b5a807313fa98f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13700_a24093849aa388844712.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a24093849aa38884471229d0297d012745cc09d16533f27d3de2b530e9937397
+size 810359
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13900_35d7bebd77dfba4d18ae.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13900_35d7bebd77dfba4d18ae.png
new file mode 100644
index 0000000000000000000000000000000000000000..fe34adef9df8c710c03e260cce393c93233c91a7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13900_35d7bebd77dfba4d18ae.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:35d7bebd77dfba4d18ae6611f18d92c9c00b4728e03425f2ecb83cb01878a664
+size 624782
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13900_94b3b077a38f51307b4b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13900_94b3b077a38f51307b4b.png
new file mode 100644
index 0000000000000000000000000000000000000000..62f255ea6dce3c5b74c08a5d0a2e1a51d643a4ac
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13900_94b3b077a38f51307b4b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:94b3b077a38f51307b4b009f6f879385ade4cfc3324a89d7f50ebda7c1babe15
+size 616581
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13900_bd20cc7c67e54472f3ec.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13900_bd20cc7c67e54472f3ec.png
new file mode 100644
index 0000000000000000000000000000000000000000..4fb3f9a52647aa72e04ca0afb8b00c654751bc80
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13900_bd20cc7c67e54472f3ec.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bd20cc7c67e54472f3ecc5aeb9cca3336123d2cc00d8b146330a187fa0baa7a6
+size 1262317
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13900_f7b7f95c649aa105a004.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13900_f7b7f95c649aa105a004.png
new file mode 100644
index 0000000000000000000000000000000000000000..e1d676dbf8ee5e20465982301ff39cb70e5de186
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_13900_f7b7f95c649aa105a004.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f7b7f95c649aa105a0045101024b93bf8708e8952d05413d4559bec4caa1f7f8
+size 576831
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14100_3fc36d751ee28996446e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14100_3fc36d751ee28996446e.png
new file mode 100644
index 0000000000000000000000000000000000000000..3e0d03b40d8450355f2db0c1011a3c11d7b9223b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14100_3fc36d751ee28996446e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3fc36d751ee28996446ecb360dbe5f049ece9d5679e3178bad85b20a390bc143
+size 1183003
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14100_553ec56dd80f874f74cf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14100_553ec56dd80f874f74cf.png
new file mode 100644
index 0000000000000000000000000000000000000000..e1a1b5a81bf2624ae2bcf019132fefe61191fceb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14100_553ec56dd80f874f74cf.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:553ec56dd80f874f74cf3571eca57265848bec366cec45d598e4282e6782764c
+size 413073
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14100_d0fb9b71e06888ddef14.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14100_d0fb9b71e06888ddef14.png
new file mode 100644
index 0000000000000000000000000000000000000000..85c238b8d0826f5a3b84452e5f34d2352d41ebdb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14100_d0fb9b71e06888ddef14.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d0fb9b71e06888ddef149e1fc27d50dc357c0788ccad71fa2009173a06e130e2
+size 560426
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14100_ef89935d8c9054480dbc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14100_ef89935d8c9054480dbc.png
new file mode 100644
index 0000000000000000000000000000000000000000..85674503e7584d56402aa2a8fe77dc582a15d290
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14100_ef89935d8c9054480dbc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ef89935d8c9054480dbc453c5c0d6eb503b8db641038328a8b2637133adfee94
+size 990762
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14300_1e9f4dbf78a3d8c8745d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14300_1e9f4dbf78a3d8c8745d.png
new file mode 100644
index 0000000000000000000000000000000000000000..b7ef121d81a904ed645891d0f15fbab6f75eb581
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14300_1e9f4dbf78a3d8c8745d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1e9f4dbf78a3d8c8745db60ca2da7525a3dbc25426f1ced57a088c45aea31cc7
+size 726861
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14300_3d1b15b57f0db788ac87.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14300_3d1b15b57f0db788ac87.png
new file mode 100644
index 0000000000000000000000000000000000000000..666f57a032d8b9de8f1fafbf5f04d46726d3e8bc
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14300_3d1b15b57f0db788ac87.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3d1b15b57f0db788ac87518cfb5353e8efb277d5f4f210d4e24c03fce4eed421
+size 998557
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14300_4794401d4b40b97e9eb0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14300_4794401d4b40b97e9eb0.png
new file mode 100644
index 0000000000000000000000000000000000000000..125d6e0aaa51d2da13a1611a080207d9fe21781b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14300_4794401d4b40b97e9eb0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4794401d4b40b97e9eb0e6dfae0d90ef01e0c0e8756d7f4022a32dabef7c5061
+size 575993
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14300_cdaf308e8e4cbecec58d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14300_cdaf308e8e4cbecec58d.png
new file mode 100644
index 0000000000000000000000000000000000000000..3b84b93ade70d9e1cc15a72bc421673018006c59
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14300_cdaf308e8e4cbecec58d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cdaf308e8e4cbecec58d67898a96b2954b0dfe1997676acb8865646637e2ccbf
+size 1185845
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14500_275499c1f8cacc3b8b25.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14500_275499c1f8cacc3b8b25.png
new file mode 100644
index 0000000000000000000000000000000000000000..df58085021f46f8284c3b4d5a2ce26001e261cf6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14500_275499c1f8cacc3b8b25.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:275499c1f8cacc3b8b250fb870dc1b7a07fbb924c9931cbadb286082c6fbc809
+size 402855
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14500_9012485c2062538ba9f2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14500_9012485c2062538ba9f2.png
new file mode 100644
index 0000000000000000000000000000000000000000..e2eda54b6da1d45e1edf8aa93148f881f8247ca5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14500_9012485c2062538ba9f2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9012485c2062538ba9f25e76f1766a604ab7579d63ffb4124648461bd3d5bdc9
+size 1732472
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14500_e17d7eeb759b0849245f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14500_e17d7eeb759b0849245f.png
new file mode 100644
index 0000000000000000000000000000000000000000..7de54cf0324e96ab2679d86edcfd574b897f734d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14500_e17d7eeb759b0849245f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e17d7eeb759b0849245f043f69b26e52bbf26aa20f89e21c5a08196b37c961bf
+size 400458
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14500_ea15022e30f1f26b2292.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14500_ea15022e30f1f26b2292.png
new file mode 100644
index 0000000000000000000000000000000000000000..672a8119341956d6317b3a776bd16461908b46fa
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14500_ea15022e30f1f26b2292.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ea15022e30f1f26b2292572567cad60bcfe6c3b7d2c82e1fd71e7f756a896217
+size 740095
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14700_99bf7cea96e6c51fef25.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14700_99bf7cea96e6c51fef25.png
new file mode 100644
index 0000000000000000000000000000000000000000..d7c9e69cc47ee9d01306af15786b59d179d84a57
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14700_99bf7cea96e6c51fef25.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:99bf7cea96e6c51fef256afb90b41faa6e2353232d1a423a1deb03625f8b0d22
+size 466035
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14700_a2443de5b5d06b53e316.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14700_a2443de5b5d06b53e316.png
new file mode 100644
index 0000000000000000000000000000000000000000..0d41574de46e8db87f28e118f222c668c49bfc70
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14700_a2443de5b5d06b53e316.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a2443de5b5d06b53e3168f2f24035f44c492c3b6905b21678d4c9f35e62eb687
+size 959399
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14700_b2fd062f0ee1078fd060.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14700_b2fd062f0ee1078fd060.png
new file mode 100644
index 0000000000000000000000000000000000000000..007a0a9b8957fabf93ca680ecd2dbad5ce571f4f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14700_b2fd062f0ee1078fd060.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b2fd062f0ee1078fd060c27d224cb369452564a278e6b839b0b887f7b0ba39ef
+size 784502
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14700_c57f21aa4b2fecc55eb9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14700_c57f21aa4b2fecc55eb9.png
new file mode 100644
index 0000000000000000000000000000000000000000..c4b28284fd813d8ee5585d65fba08cb71f8397cb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14700_c57f21aa4b2fecc55eb9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c57f21aa4b2fecc55eb9ce3cdadc0ff279309e8e0b97bd87751f44b1b4234dfe
+size 477974
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14900_12bbcfeb306b5dc35194.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14900_12bbcfeb306b5dc35194.png
new file mode 100644
index 0000000000000000000000000000000000000000..bba1e259f32b460811726246ee911494ee35581e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14900_12bbcfeb306b5dc35194.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:12bbcfeb306b5dc3519441f3eeae4be5179b322224d5a9bd09d387d4855c47b5
+size 985601
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14900_a766b51c777ef8007bb9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14900_a766b51c777ef8007bb9.png
new file mode 100644
index 0000000000000000000000000000000000000000..9f190f0263cf53d1e8e1ff253721301fd9c51841
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14900_a766b51c777ef8007bb9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a766b51c777ef8007bb9e95e9811898407c469195c7ca7b16f028676deb58e3b
+size 556590
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14900_b42ae326970a3729cff0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14900_b42ae326970a3729cff0.png
new file mode 100644
index 0000000000000000000000000000000000000000..ba7ad90e97aa637320f122e24631b8d07adaf4bd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14900_b42ae326970a3729cff0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b42ae326970a3729cff0917459cfa7b410cc45a7c4605ea5aa84fd19df81f82c
+size 579407
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14900_e6dffc4155697b7a5006.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14900_e6dffc4155697b7a5006.png
new file mode 100644
index 0000000000000000000000000000000000000000..0bf5d08cb3f2c848f67f50f3dfbbf6e6203a7038
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_14900_e6dffc4155697b7a5006.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e6dffc4155697b7a50062a1c558b96119063fdad1a91206de1318fb33edb2b7d
+size 672757
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1500_1e867107b29336ecec92.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1500_1e867107b29336ecec92.png
new file mode 100644
index 0000000000000000000000000000000000000000..b7256dfcf2fe8f86825777e5d2de3ac8350affd3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1500_1e867107b29336ecec92.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1e867107b29336ecec926f87ee7acd3026c4eac12247c2196ce3ad986da5392e
+size 178222
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1500_1ee71bc98819e08841d4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1500_1ee71bc98819e08841d4.png
new file mode 100644
index 0000000000000000000000000000000000000000..938136e6b403e4a4c731612f32bf3b7f940cbf82
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1500_1ee71bc98819e08841d4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1ee71bc98819e08841d4026e8c33c952a4f32066e382bd259071168308af1044
+size 435487
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1500_997ecd882ba6e33319b7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1500_997ecd882ba6e33319b7.png
new file mode 100644
index 0000000000000000000000000000000000000000..92ae87a57dfdbb301a1a35fb9276e44aa91677a4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1500_997ecd882ba6e33319b7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:997ecd882ba6e33319b7096bbb93522782871cf11d7a5fa63f2d7cf691906838
+size 248925
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1500_c838eade5b2b6d6a78f4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1500_c838eade5b2b6d6a78f4.png
new file mode 100644
index 0000000000000000000000000000000000000000..aa342bd80f43d472231c916ade60189d809d3da5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1500_c838eade5b2b6d6a78f4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c838eade5b2b6d6a78f4ab3bc42cfc0ebf9609ae59d256f9dd56126afc3a1297
+size 188407
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15100_747fd5fd78d76d732b31.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15100_747fd5fd78d76d732b31.png
new file mode 100644
index 0000000000000000000000000000000000000000..8d0ecf2d6cbda713a9de70a815ee2e3227dd5b87
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15100_747fd5fd78d76d732b31.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:747fd5fd78d76d732b319c0c2be023b72060c1bd95fcf5f09fd6ce81083b1fbe
+size 1256764
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15100_aeef63256e79fc619bf3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15100_aeef63256e79fc619bf3.png
new file mode 100644
index 0000000000000000000000000000000000000000..89997fcf8d35be99190a0b187422df219c3e11af
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15100_aeef63256e79fc619bf3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aeef63256e79fc619bf340dde71dbc5d98fc7f963b06c83b4b53c5c9232213c5
+size 524567
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15100_ba9b9dfe32be3c0915c0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15100_ba9b9dfe32be3c0915c0.png
new file mode 100644
index 0000000000000000000000000000000000000000..9822840c311be25233a81066243e4b5fb60cd2f6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15100_ba9b9dfe32be3c0915c0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ba9b9dfe32be3c0915c0894623b4e5e0905e41a82d03b8bdc1d22a3129611041
+size 896036
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15100_f096ac74b8415bcd6557.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15100_f096ac74b8415bcd6557.png
new file mode 100644
index 0000000000000000000000000000000000000000..cb92f2087a9d7fd5766888e5c5f0e17fbf8939f7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15100_f096ac74b8415bcd6557.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f096ac74b8415bcd655772dafe7a5e9b191eb72f5a78670274758d363fc72c0d
+size 421346
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15300_36f618d5780812d90bfc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15300_36f618d5780812d90bfc.png
new file mode 100644
index 0000000000000000000000000000000000000000..50f239b341b1e33ec07b594d7f43698f17a52426
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15300_36f618d5780812d90bfc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:36f618d5780812d90bfc908982fb017b7e065ae6571d1d402366f8a136b76280
+size 504639
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15300_505ec817e4b8034db4a7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15300_505ec817e4b8034db4a7.png
new file mode 100644
index 0000000000000000000000000000000000000000..2924c7e73552318d2c6f07fa1451f736c07983ff
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15300_505ec817e4b8034db4a7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:505ec817e4b8034db4a79433f9ecfcfc3e27ca9e3a685d95e5dc602a40cd2bdb
+size 717552
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15300_699a8038b151e48b6be9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15300_699a8038b151e48b6be9.png
new file mode 100644
index 0000000000000000000000000000000000000000..be5711bf56f031411f9dfa26c8397261296c4ae2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15300_699a8038b151e48b6be9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:699a8038b151e48b6be9be525703bdac74bef61a217c51885ca0985b2321f721
+size 476777
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15300_dc2c3af8fedffa5fdcda.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15300_dc2c3af8fedffa5fdcda.png
new file mode 100644
index 0000000000000000000000000000000000000000..e1424d7174753c005119aff02adaaebd33198469
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15300_dc2c3af8fedffa5fdcda.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dc2c3af8fedffa5fdcda83b39253a3aa1f5c800edda877110bbba15cdc817cdb
+size 840062
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15500_376c35f70dd2647c7979.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15500_376c35f70dd2647c7979.png
new file mode 100644
index 0000000000000000000000000000000000000000..87c0797e50304af656032a3d8efef01d7f4d6979
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15500_376c35f70dd2647c7979.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:376c35f70dd2647c797983eebda12c6660a913a8cd446f5c6e2b2c07d69f45ce
+size 453435
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15500_898065a1a84fc0ca90e3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15500_898065a1a84fc0ca90e3.png
new file mode 100644
index 0000000000000000000000000000000000000000..c7756047c067ca9dbdab32a0581b7d4402a29736
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15500_898065a1a84fc0ca90e3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:898065a1a84fc0ca90e32f759556a1493e44a966a8edfbd4d304f5cfbf9e7b25
+size 766751
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15500_9b2d0c8a99330e1a6957.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15500_9b2d0c8a99330e1a6957.png
new file mode 100644
index 0000000000000000000000000000000000000000..63a69a09d6283aa8a4014b086373f0400db07394
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15500_9b2d0c8a99330e1a6957.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9b2d0c8a99330e1a6957008bf5bbc3b3e0beebdd1808cbb491ac05d9cc04cb61
+size 484957
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15500_ed061a289fc9260e4179.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15500_ed061a289fc9260e4179.png
new file mode 100644
index 0000000000000000000000000000000000000000..200611f312859fcee79e4d4e98d331a06d6912a8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15500_ed061a289fc9260e4179.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ed061a289fc9260e4179339c4e699f983c0b84e5b1c5401c4acf32065ef14868
+size 855364
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15700_140226d4c2c8869ffc01.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15700_140226d4c2c8869ffc01.png
new file mode 100644
index 0000000000000000000000000000000000000000..2231d55948d15947727a9fd38045b13b74318e3e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15700_140226d4c2c8869ffc01.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:140226d4c2c8869ffc018fac50d785006db2081b44505fe6369d28f48fd34b20
+size 1252747
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15700_696184ed59e69f293742.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15700_696184ed59e69f293742.png
new file mode 100644
index 0000000000000000000000000000000000000000..163ecd837f95e28ba72cf3ee1f8b9c37996467a7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15700_696184ed59e69f293742.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:696184ed59e69f293742fb985032dceddcd5c6a2b0c5b4eaea8157b8411dfe43
+size 588666
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15700_70d4c74b84ebb1db27b6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15700_70d4c74b84ebb1db27b6.png
new file mode 100644
index 0000000000000000000000000000000000000000..8946d7f71e2b93de9521ed9265778b658ff0db1d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15700_70d4c74b84ebb1db27b6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:70d4c74b84ebb1db27b674fafd321e5c451d70e9f47fdd798b97a75bb0f19bf5
+size 568527
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15700_c4c6afbd9c263b406783.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15700_c4c6afbd9c263b406783.png
new file mode 100644
index 0000000000000000000000000000000000000000..bb348718d60e1141abe8f83b82e5d4f809304018
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15700_c4c6afbd9c263b406783.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c4c6afbd9c263b4067831aad9d3d2f7895dd6d40544ac030e96044dd1a998fdd
+size 752229
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15900_2b10d787bae0ee849277.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15900_2b10d787bae0ee849277.png
new file mode 100644
index 0000000000000000000000000000000000000000..4d3d02d2917ece416295ac50a7c1a1b05db1b5d3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15900_2b10d787bae0ee849277.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2b10d787bae0ee849277579b9ccfd40ac43883b959ef57de4745e1d4ce8f783d
+size 1487306
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15900_4853d16b12da7bcd81c4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15900_4853d16b12da7bcd81c4.png
new file mode 100644
index 0000000000000000000000000000000000000000..d948949e13c4bf8378a06e5de1e5e6338c3a3533
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15900_4853d16b12da7bcd81c4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4853d16b12da7bcd81c4e5f3c3cd846637c071130fcc3d92cf4e083106e12e4f
+size 549603
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15900_8c6a8dca8c2755204b4d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15900_8c6a8dca8c2755204b4d.png
new file mode 100644
index 0000000000000000000000000000000000000000..3b883a266e97266f432779b30781ac6ff6ccf222
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15900_8c6a8dca8c2755204b4d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8c6a8dca8c2755204b4de7d5779642cd0fde7f93e9fc6f5a69e1bba43a66a04d
+size 755761
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15900_dfbd7be8094af6611729.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15900_dfbd7be8094af6611729.png
new file mode 100644
index 0000000000000000000000000000000000000000..3b2f331636b5fcb6ea6e04c379b8db414c532bf9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_15900_dfbd7be8094af6611729.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dfbd7be8094af661172955dd50ca78f03b2cf4025186cee40e5ba07ddb174a9a
+size 651409
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16100_4aefd057c176a47af004.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16100_4aefd057c176a47af004.png
new file mode 100644
index 0000000000000000000000000000000000000000..d79987859cb23096c9fbda76743f299bd231507a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16100_4aefd057c176a47af004.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4aefd057c176a47af0042ce5a3fa352437c9a153bb85e00e75c2f0082eb9f354
+size 475528
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16100_d94786effcd745d1ea87.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16100_d94786effcd745d1ea87.png
new file mode 100644
index 0000000000000000000000000000000000000000..258f4974ee0a046ef12168b71a629a7ec1c0472f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16100_d94786effcd745d1ea87.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d94786effcd745d1ea870b3fe489c9d4debafc051f2a9543aa13cd37bfda9f56
+size 516796
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16100_f6cbcf30b56dd781a190.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16100_f6cbcf30b56dd781a190.png
new file mode 100644
index 0000000000000000000000000000000000000000..cd08f612f4418e1d5b6f0f87574c7328b912b9fa
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16100_f6cbcf30b56dd781a190.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f6cbcf30b56dd781a19082ee5544079376a6aba7594f671ebccafec163619f59
+size 1050397
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16100_fb1fc01a9c7dc0342be5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16100_fb1fc01a9c7dc0342be5.png
new file mode 100644
index 0000000000000000000000000000000000000000..5a357db37ad45b03f9876a62f6a949e3329324eb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16100_fb1fc01a9c7dc0342be5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fb1fc01a9c7dc0342be5e1bf0605ab6352cef1a52f59331485388b2d60ca0872
+size 1034374
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16300_892ed9b2c56c05a07ce3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16300_892ed9b2c56c05a07ce3.png
new file mode 100644
index 0000000000000000000000000000000000000000..b788f85514a07423e630f3c2a769af5ecd1006a4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16300_892ed9b2c56c05a07ce3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:892ed9b2c56c05a07ce3940625b6ae3f2d12513acd869521af61ced3777e5662
+size 1084509
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16300_8dc82528346311bc5ca2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16300_8dc82528346311bc5ca2.png
new file mode 100644
index 0000000000000000000000000000000000000000..374312c2b15f70a9aca0a74a332c19ec5a66592f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16300_8dc82528346311bc5ca2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8dc82528346311bc5ca2ea3e79edd553ee7c1c35b9b82fdd3eff76df56865cf7
+size 649086
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16300_dda56791e6ca1978b789.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16300_dda56791e6ca1978b789.png
new file mode 100644
index 0000000000000000000000000000000000000000..e53ecc59a610e834a4f08e3a5fa8a96907f86307
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16300_dda56791e6ca1978b789.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dda56791e6ca1978b7897b5de14ab9a7b0023c9f1b5680e13347c00502812a78
+size 511899
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16300_f4c4591ef4ae1de4262b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16300_f4c4591ef4ae1de4262b.png
new file mode 100644
index 0000000000000000000000000000000000000000..e09f475bdc5ffc9cb662811973e4d8426deacf57
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16300_f4c4591ef4ae1de4262b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f4c4591ef4ae1de4262bf7968c0e8d411d730cf085efcc7b1fbf24551bd6e9c3
+size 353473
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16500_08790ae1695e6f0ff3a4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16500_08790ae1695e6f0ff3a4.png
new file mode 100644
index 0000000000000000000000000000000000000000..1800b562ad5bb0764c05ad4973fd0f24426fffdd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16500_08790ae1695e6f0ff3a4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:08790ae1695e6f0ff3a4b6a925820007e5b4af684f9685d26b000527f66a8b89
+size 1535148
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16500_c63296ce07e2a42001ec.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16500_c63296ce07e2a42001ec.png
new file mode 100644
index 0000000000000000000000000000000000000000..d507650d2400fc0bbc20c8ed06fd5c955a513853
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16500_c63296ce07e2a42001ec.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c63296ce07e2a42001ec6531b06f9c88b6a2add4bdf1fce168b28d87cfdebcea
+size 455667
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16500_c88c8409b57c767858d2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16500_c88c8409b57c767858d2.png
new file mode 100644
index 0000000000000000000000000000000000000000..eff4654aecd75ee76dcb6e8afe3cbd41c570a4b3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16500_c88c8409b57c767858d2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c88c8409b57c767858d2115ab9ce175b05cea3ee6f4c93b2b240e9de887d5f29
+size 532707
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16500_e4f06e37c7d97e796334.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16500_e4f06e37c7d97e796334.png
new file mode 100644
index 0000000000000000000000000000000000000000..c78e0fcaab4834064077b920d030e4dc30855580
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16500_e4f06e37c7d97e796334.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e4f06e37c7d97e796334168bd0760822d2a1eb114d43b5c6a38a116c46a5f8b5
+size 914048
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16700_7095204cc14b90199137.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16700_7095204cc14b90199137.png
new file mode 100644
index 0000000000000000000000000000000000000000..77edcdcad5592e8bfb065d492dd799be24a58e28
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16700_7095204cc14b90199137.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7095204cc14b90199137a8aa6e7cd9aa9fca18ad0a9173fbe67bcfd0f6df68ef
+size 325631
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16700_9ea516029865fcd823ce.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16700_9ea516029865fcd823ce.png
new file mode 100644
index 0000000000000000000000000000000000000000..75a278d82421d6e066f33845e6e663c51cdf462e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16700_9ea516029865fcd823ce.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9ea516029865fcd823cef331a3670e3b3497df857382804b373a55c505335ef8
+size 935310
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16700_a4d700c6b30b820bc9ad.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16700_a4d700c6b30b820bc9ad.png
new file mode 100644
index 0000000000000000000000000000000000000000..f4c4dfcf96d70c3d664d6dc8c43941ce9e71bd1c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16700_a4d700c6b30b820bc9ad.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a4d700c6b30b820bc9ad77d6b30f2757891f1c40c1893daef6adadf3a6840a99
+size 860441
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16700_efdff70455e7f2f3ffce.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16700_efdff70455e7f2f3ffce.png
new file mode 100644
index 0000000000000000000000000000000000000000..3d19cc6ad8d19d0851b3114c92848d3221c6a47a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16700_efdff70455e7f2f3ffce.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:efdff70455e7f2f3ffce559bfc512d6673d1efd3ce2610f24dc9998f466b5824
+size 297712
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16900_2e17d70dc9ce3960db32.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16900_2e17d70dc9ce3960db32.png
new file mode 100644
index 0000000000000000000000000000000000000000..82d7329d90e46d2ef36359d62acae3152f7a5646
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16900_2e17d70dc9ce3960db32.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2e17d70dc9ce3960db3275a09a0fa9be64398092cf6e518fe60089d90de19ca3
+size 637120
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16900_3c58f98f653a91397138.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16900_3c58f98f653a91397138.png
new file mode 100644
index 0000000000000000000000000000000000000000..d5f037943227fe1127418de68a44018522a536bf
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16900_3c58f98f653a91397138.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3c58f98f653a913971383ac1789e8a4e3f72cffff5ee1eb49a9bc88137521b6f
+size 1733701
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16900_6493e79ef18df832ab30.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16900_6493e79ef18df832ab30.png
new file mode 100644
index 0000000000000000000000000000000000000000..f7d1a4d6f295b6693eb1a5e82054b17ca2387d47
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16900_6493e79ef18df832ab30.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6493e79ef18df832ab30761343142c83069958706299dfd120bd578645bbf69e
+size 586949
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16900_7e0bead318d191e58728.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16900_7e0bead318d191e58728.png
new file mode 100644
index 0000000000000000000000000000000000000000..e6a55d19d2a17acfeaeba40aa9139f6665018803
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_16900_7e0bead318d191e58728.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7e0bead318d191e58728ca8eafefcb181e7126c68e21fb17ea253118f611b40e
+size 509057
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1700_5e4ba2ad94bacb040627.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1700_5e4ba2ad94bacb040627.png
new file mode 100644
index 0000000000000000000000000000000000000000..aadf228fabcbc444e74e8cb55b79224e5fc781fd
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1700_5e4ba2ad94bacb040627.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1700_966b9caa4d6da5a2c3c2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1700_966b9caa4d6da5a2c3c2.png
new file mode 100644
index 0000000000000000000000000000000000000000..2708f03941eb2c739ba6a65c1fb23e927c0ecf91
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1700_966b9caa4d6da5a2c3c2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:966b9caa4d6da5a2c3c22f7dccf55f4a72c0350879a18fe6fc5b18d814a74d3a
+size 137124
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1700_ac4d7a431a13a2f34af0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1700_ac4d7a431a13a2f34af0.png
new file mode 100644
index 0000000000000000000000000000000000000000..1e4f0fd2a9e589a4504409443c4b250e7d87be1a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1700_ac4d7a431a13a2f34af0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ac4d7a431a13a2f34af0f3146ca79434f76f146a75de466d2b2aa50ee740d3d2
+size 139169
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1700_fb76861bfd7cfaef1cf8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1700_fb76861bfd7cfaef1cf8.png
new file mode 100644
index 0000000000000000000000000000000000000000..5a634888e579300417168c0cc103eb990be4cc2b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1700_fb76861bfd7cfaef1cf8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fb76861bfd7cfaef1cf8bb041188cce64f486183b99bad58dae256b7c94d075b
+size 122464
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17100_9a47b4c0e29b6654acfd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17100_9a47b4c0e29b6654acfd.png
new file mode 100644
index 0000000000000000000000000000000000000000..0bc0d7da6758a45295464c41eed904b4d2350859
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17100_9a47b4c0e29b6654acfd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9a47b4c0e29b6654acfd1ff2bd590e51895a232e2ba5278fc2bf417313da28fc
+size 378039
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17100_abfa4144fd10a0321ffa.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17100_abfa4144fd10a0321ffa.png
new file mode 100644
index 0000000000000000000000000000000000000000..697591d21dacea964a335319a896ae3b4af9ca45
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17100_abfa4144fd10a0321ffa.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:abfa4144fd10a0321ffad2808d9480bb7fd01fd4eb5677a30de36cc4adf68c9c
+size 648514
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17100_c0816bbc2ec2ea88a3b4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17100_c0816bbc2ec2ea88a3b4.png
new file mode 100644
index 0000000000000000000000000000000000000000..d9c90909b4757b7da67d04026a198ac862e23987
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17100_c0816bbc2ec2ea88a3b4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c0816bbc2ec2ea88a3b440363eccdc495d343c5419774139075d6adce8ed35d3
+size 1785286
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17100_c269affee3aaa75c46f1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17100_c269affee3aaa75c46f1.png
new file mode 100644
index 0000000000000000000000000000000000000000..feb18623323d9229d1dcb808b3c04e73c857b35a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17100_c269affee3aaa75c46f1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c269affee3aaa75c46f1ca4cd52ca96faaab9332f286c1f62569347c65dd31d9
+size 1200729
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17300_0956961e1046c865d9c1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17300_0956961e1046c865d9c1.png
new file mode 100644
index 0000000000000000000000000000000000000000..21d64e810859006dc3212769edd263f7de6d1a69
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17300_0956961e1046c865d9c1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0956961e1046c865d9c1928f682fad1fe213a595281e0c30f5ffaf037386e720
+size 310949
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17300_31c8409fbe1e9b32eb58.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17300_31c8409fbe1e9b32eb58.png
new file mode 100644
index 0000000000000000000000000000000000000000..c386594cdae9904e2879fa334c2cff015df2fc69
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17300_31c8409fbe1e9b32eb58.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:31c8409fbe1e9b32eb58b72238ba7355ba6895733fb8f35f626749e86d3f5a9a
+size 561814
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17300_59f720d182f24778029a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17300_59f720d182f24778029a.png
new file mode 100644
index 0000000000000000000000000000000000000000..f79d04b4822518aebff95c9a1dedefe04180b340
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17300_59f720d182f24778029a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:59f720d182f24778029a10684811f60fb73860227a03eb7b574bdd8c08a405f0
+size 1441845
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17300_e7be0c6aa3c65b1880b6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17300_e7be0c6aa3c65b1880b6.png
new file mode 100644
index 0000000000000000000000000000000000000000..dc059885a4cbeda42b66f1b87ee9f2ccddb5cd68
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17300_e7be0c6aa3c65b1880b6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e7be0c6aa3c65b1880b6a093537d48f786c43ad5a0cdb2602f071104f28c1a1f
+size 980795
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17500_185786eccdff074ea5ac.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17500_185786eccdff074ea5ac.png
new file mode 100644
index 0000000000000000000000000000000000000000..ac65279273bfb1f0327e5271943bad2ba745a977
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17500_185786eccdff074ea5ac.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:185786eccdff074ea5ac4938522bd01e6aa2de3ca1531ed5913d2cd5f4a08b03
+size 1116988
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17500_508b23b02be2d26d40d1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17500_508b23b02be2d26d40d1.png
new file mode 100644
index 0000000000000000000000000000000000000000..cf9a188efdcfc00f458c2aaf35d52bc7523998f2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17500_508b23b02be2d26d40d1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:508b23b02be2d26d40d1af52e146f14a5a377c859f9e90101d67f2cf7470d7d4
+size 538089
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17500_79e5b72855f822a53478.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17500_79e5b72855f822a53478.png
new file mode 100644
index 0000000000000000000000000000000000000000..a96381d1698c36be09f69167d3600447c8c96ddd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17500_79e5b72855f822a53478.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:79e5b72855f822a5347810c22c059c8d4fa0ab7f68378e7267323fcd61a4ffd7
+size 1164056
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17500_d157b3c49263b2da7976.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17500_d157b3c49263b2da7976.png
new file mode 100644
index 0000000000000000000000000000000000000000..fafbc8d85de0436b4ba2e6950dcef0e346dc7e81
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17500_d157b3c49263b2da7976.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d157b3c49263b2da7976c58dce244005456c4e26478a402c66e53c76f77abdfc
+size 550573
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17700_610fde9ff5e995f875be.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17700_610fde9ff5e995f875be.png
new file mode 100644
index 0000000000000000000000000000000000000000..5dac94fb0eb7febee62babac68b34d17cce659af
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17700_610fde9ff5e995f875be.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:610fde9ff5e995f875bed610f3d159a0d7bf42ad697901bd776f01abfd8b3d30
+size 372767
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17700_6f96e0a5f77005687f66.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17700_6f96e0a5f77005687f66.png
new file mode 100644
index 0000000000000000000000000000000000000000..84dbcde8334b283d757f44eef033506c2e5721f0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17700_6f96e0a5f77005687f66.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6f96e0a5f77005687f6607e6622b9c1dbd18b45ef1424529b43786921f49c0e6
+size 1519860
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17700_999dc8cd26a0f1a07c31.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17700_999dc8cd26a0f1a07c31.png
new file mode 100644
index 0000000000000000000000000000000000000000..2ecd631f680a809260ce441c5d851dfad24ce2d2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17700_999dc8cd26a0f1a07c31.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:999dc8cd26a0f1a07c31a7e40a4d19724fccd2188635aefe888d03d2eb51dd9b
+size 508599
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17700_d5ddf3d105b924ad8db6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17700_d5ddf3d105b924ad8db6.png
new file mode 100644
index 0000000000000000000000000000000000000000..7ea47c3bc2fa0c8f9ac24eebb825d394172c5038
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17700_d5ddf3d105b924ad8db6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d5ddf3d105b924ad8db6358c42dbdaa304c8fb96fa2f2666b09398c0bfcc0536
+size 742122
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17900_08954862703018b2d431.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17900_08954862703018b2d431.png
new file mode 100644
index 0000000000000000000000000000000000000000..a1e1c4e418b6e78e33a99692343d8c31b43645a0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17900_08954862703018b2d431.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:08954862703018b2d4315a405f4c10085ac28360e69aadabe53f4838caec5472
+size 856233
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17900_28b0b86a2aabd83c00cf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17900_28b0b86a2aabd83c00cf.png
new file mode 100644
index 0000000000000000000000000000000000000000..fb35c091f17baf6294d8315dd50a751e3ce2e44b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17900_28b0b86a2aabd83c00cf.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:28b0b86a2aabd83c00cfe8283944490399c0dde8f74e7fafec1fd576354a2b5e
+size 419406
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17900_541fc17a9f438ec61787.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17900_541fc17a9f438ec61787.png
new file mode 100644
index 0000000000000000000000000000000000000000..fe8cddd729956b80657f9c5a31994176ce4d30b2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17900_541fc17a9f438ec61787.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:541fc17a9f438ec617875a26f4cc6631b205f56969aadb488cee2115e04dec34
+size 470739
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17900_c865f6ce0b6c854112d4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17900_c865f6ce0b6c854112d4.png
new file mode 100644
index 0000000000000000000000000000000000000000..0ac0753ced9f55990b6692f8b96d47a684a73ef5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_17900_c865f6ce0b6c854112d4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c865f6ce0b6c854112d4b149a615efcf8525ec0bb037d490ae3db0f8e59ad47a
+size 1264958
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18100_4baa94f9dbcb0cf8c1c6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18100_4baa94f9dbcb0cf8c1c6.png
new file mode 100644
index 0000000000000000000000000000000000000000..02e78ecab45f2df4e163331c0843d1b20b84c806
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18100_4baa94f9dbcb0cf8c1c6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4baa94f9dbcb0cf8c1c68cb357556cb01d018d858ec04ff5df336a463945f4cc
+size 609800
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18100_801fc42f518715e30677.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18100_801fc42f518715e30677.png
new file mode 100644
index 0000000000000000000000000000000000000000..3bf3211e91a35b77ba5cfac32d2f82f50f3af009
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18100_801fc42f518715e30677.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:801fc42f518715e3067728ce2cdece2c47dcaedbf063e171225b46eafd82d524
+size 748941
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18100_c4752f2d98b5dd62c566.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18100_c4752f2d98b5dd62c566.png
new file mode 100644
index 0000000000000000000000000000000000000000..229613f1533e1864b0d212409cac6164adeaadaf
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18100_c4752f2d98b5dd62c566.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c4752f2d98b5dd62c5667f8d154fd5bc17206fc52083cc7dd6977483a337dc8c
+size 423376
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18100_d522600701db0d7f42ad.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18100_d522600701db0d7f42ad.png
new file mode 100644
index 0000000000000000000000000000000000000000..29e5a731bf4314aa7f461bc5e3d13bc97a4a172b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18100_d522600701db0d7f42ad.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d522600701db0d7f42ad693c43d62a0f51d5be0413ab989c23e657e1855b2c3a
+size 947391
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18300_0a127d3bf6a23f3168fc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18300_0a127d3bf6a23f3168fc.png
new file mode 100644
index 0000000000000000000000000000000000000000..3f6a82fbadfcfbb07ad32d92c1402eb84b32505e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18300_0a127d3bf6a23f3168fc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0a127d3bf6a23f3168fc6e05ae40e53e2985d6e754f3880a0e7b1112bbc15d04
+size 1099901
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18300_2f31b51fcfb7266fa7ff.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18300_2f31b51fcfb7266fa7ff.png
new file mode 100644
index 0000000000000000000000000000000000000000..9a5686335d7b162d1e422e7b5e7bd2e73446ae72
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18300_2f31b51fcfb7266fa7ff.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2f31b51fcfb7266fa7ff1185a39005e2ed6bd993891146ab180f1374f0efc227
+size 588555
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18300_62447cbe3d240f39245f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18300_62447cbe3d240f39245f.png
new file mode 100644
index 0000000000000000000000000000000000000000..143f7b263379e1f75306301c024e51fefea8c991
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18300_62447cbe3d240f39245f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:62447cbe3d240f39245fc1cfdd6a4bd932791990dc265eb4fb5c9af7688b838f
+size 607768
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18300_709e4aa728e9344cfd9a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18300_709e4aa728e9344cfd9a.png
new file mode 100644
index 0000000000000000000000000000000000000000..d603414742dbec3998ec5c0bb9d57f62dc90832b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18300_709e4aa728e9344cfd9a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:709e4aa728e9344cfd9a611abd5c5273197114afb92f966235f9e65d2cd17918
+size 784036
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18500_20808d34b512b8308918.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18500_20808d34b512b8308918.png
new file mode 100644
index 0000000000000000000000000000000000000000..10e9732b80a2d4657e072bbdaa98e4b02c905abc
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18500_20808d34b512b8308918.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:20808d34b512b830891853eb0379eb25d1429f06e52daa8e546f427116e86106
+size 634444
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18500_6e508dddde222a4b4ee3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18500_6e508dddde222a4b4ee3.png
new file mode 100644
index 0000000000000000000000000000000000000000..c4eae9fe8de77727c873ec8a0020e862a92e5bc7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18500_6e508dddde222a4b4ee3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e508dddde222a4b4ee3c80acea168bec639bc2b1ecb4526bb8a0688455ef4bd
+size 462911
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18500_f9279c3927240a6daf34.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18500_f9279c3927240a6daf34.png
new file mode 100644
index 0000000000000000000000000000000000000000..4559ebec47fb1668f619e71101b99ef22324d97f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18500_f9279c3927240a6daf34.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f9279c3927240a6daf34d35f280fb79435551002481864ba82e53a5fdaa9ce56
+size 487355
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18500_f9e01a1ff02ab7fcdd03.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18500_f9e01a1ff02ab7fcdd03.png
new file mode 100644
index 0000000000000000000000000000000000000000..a12438f3a232bc60f4393d6d2c5a541ab4e3cf02
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18500_f9e01a1ff02ab7fcdd03.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f9e01a1ff02ab7fcdd03fd78a5f2f6dc339e21057dbbd68a47e43f74c9d2f82c
+size 1442359
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18700_0134bd990e470cad0abb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18700_0134bd990e470cad0abb.png
new file mode 100644
index 0000000000000000000000000000000000000000..7fb45ccd9f18ca62aabf292c0896eaaead5125f8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18700_0134bd990e470cad0abb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0134bd990e470cad0abb469d09907ab49e0c4901a1a4fcc6326851bfb2c74789
+size 554007
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18700_6eb4eb2dcd1a5c271a73.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18700_6eb4eb2dcd1a5c271a73.png
new file mode 100644
index 0000000000000000000000000000000000000000..752174540e8e0598a32b8ae2d081c7022b2d6a4b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18700_6eb4eb2dcd1a5c271a73.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6eb4eb2dcd1a5c271a736c132fb4fc1fde3e0ddcc6bdb26edd56bc40d0b5eea7
+size 912747
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18700_73b56012ca52d80f7956.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18700_73b56012ca52d80f7956.png
new file mode 100644
index 0000000000000000000000000000000000000000..4474303f7055d1de2120e2efbfc4c0b5d6a7c763
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18700_73b56012ca52d80f7956.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:73b56012ca52d80f795635e8be29a92a9f0ef2dbcc9509004e4367bf3787c8ea
+size 1023101
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18700_d78e6381e61e36d01b67.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18700_d78e6381e61e36d01b67.png
new file mode 100644
index 0000000000000000000000000000000000000000..588123e7002045753cc8233f68a6fbbc18652cc7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18700_d78e6381e61e36d01b67.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d78e6381e61e36d01b67a5797173bdadd536be7e0736669eb31fcd784927047c
+size 585206
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18900_47c48746337e3eabfed3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18900_47c48746337e3eabfed3.png
new file mode 100644
index 0000000000000000000000000000000000000000..3615167ee6a7479ba7f40814ac82607ce521b714
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18900_47c48746337e3eabfed3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:47c48746337e3eabfed372ca3ec1cdaf0f032a10ba764d29a948ecbf143bf73b
+size 972619
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18900_58aa50dbb96e5daca8ec.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18900_58aa50dbb96e5daca8ec.png
new file mode 100644
index 0000000000000000000000000000000000000000..f2e61018378a9eb9913e3a48f7b3f154e0ed514f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18900_58aa50dbb96e5daca8ec.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:58aa50dbb96e5daca8ec72094d019c97a98a588d8f7b6b2ca9a9618d73c1949e
+size 1197024
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18900_596f889ecd4fb8dcc7b3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18900_596f889ecd4fb8dcc7b3.png
new file mode 100644
index 0000000000000000000000000000000000000000..d01fa268194df76dd3ed34e6ed21ff369ed82bc2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18900_596f889ecd4fb8dcc7b3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:596f889ecd4fb8dcc7b3b586122ae5af4a0cd09b672b6f3f0742bf39c75a0bdb
+size 667432
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18900_ed35799fbf9ef07f6b29.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18900_ed35799fbf9ef07f6b29.png
new file mode 100644
index 0000000000000000000000000000000000000000..5e6e7c36bfc5662ff3853e311e1f71c91ba9dc50
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_18900_ed35799fbf9ef07f6b29.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ed35799fbf9ef07f6b29a07d5e90854f4c51ff82d1065b91c9a99f395f3ec6a7
+size 742245
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1900_4c6f76ff12da79683251.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1900_4c6f76ff12da79683251.png
new file mode 100644
index 0000000000000000000000000000000000000000..cbe5e872dc3678ff265afebf5aaf6045d670681c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1900_4c6f76ff12da79683251.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4c6f76ff12da7968325171f354b5e0eaad45fdbd9283f75b014face88aa15bdd
+size 1102053
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1900_5f833bca9afe3a594b70.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1900_5f833bca9afe3a594b70.png
new file mode 100644
index 0000000000000000000000000000000000000000..55a01f073e52e30093d3b85527c061df253bec04
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1900_5f833bca9afe3a594b70.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5f833bca9afe3a594b70cce8dca9f6218d522dfd97b5123b98397f93199e4ec8
+size 705086
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1900_90394e55ef8f0984258e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1900_90394e55ef8f0984258e.png
new file mode 100644
index 0000000000000000000000000000000000000000..d1811904c938ba477828d340d852bbf8c220d2b9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1900_90394e55ef8f0984258e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:90394e55ef8f0984258e016564148beb50656a3c288d6b41418e5f78d9185ef2
+size 647569
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1900_c273ee6b1500a9801a78.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1900_c273ee6b1500a9801a78.png
new file mode 100644
index 0000000000000000000000000000000000000000..4ea5ab0650db216bf6fa15992855c0728f5614a0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_1900_c273ee6b1500a9801a78.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c273ee6b1500a9801a7829684fcddc9415be8ed28b36b23dbdb6cd3f969c089e
+size 722192
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19100_0cf6beb7b755522442ad.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19100_0cf6beb7b755522442ad.png
new file mode 100644
index 0000000000000000000000000000000000000000..0e636431dd5068a8834e68e69aa71c3e075bb65c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19100_0cf6beb7b755522442ad.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0cf6beb7b755522442ad73cf570367db64814eeaff6213c7fa7bfffab217e3db
+size 660786
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19100_118d6efa6edf6c16dde6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19100_118d6efa6edf6c16dde6.png
new file mode 100644
index 0000000000000000000000000000000000000000..adb4d0c1d95948d7088b4e10f23d690e9ff0ca18
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19100_118d6efa6edf6c16dde6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:118d6efa6edf6c16dde6eba79e02a41287edd4e55871ac3cda549da7b277a131
+size 1370310
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19100_26474df3cb8357f5fb85.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19100_26474df3cb8357f5fb85.png
new file mode 100644
index 0000000000000000000000000000000000000000..a8073c195a1da69e2a539d48984e4dbc132eaec8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19100_26474df3cb8357f5fb85.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:26474df3cb8357f5fb8545162a48ea9d4b3eab79959675c5ab32a3e4fef47519
+size 412348
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19100_c229b3dbc72e3b1010ef.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19100_c229b3dbc72e3b1010ef.png
new file mode 100644
index 0000000000000000000000000000000000000000..a4d1e5397f8ea92a7cbff22932e202f0a2143b35
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19100_c229b3dbc72e3b1010ef.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c229b3dbc72e3b1010efb4e0a1e3c0195c6d3f72a87b8a16dc2d83f6bf790143
+size 1279818
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19300_0b39d10e5742a921c2a6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19300_0b39d10e5742a921c2a6.png
new file mode 100644
index 0000000000000000000000000000000000000000..917207546312619a0af0b86c63c6d801065bad15
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19300_0b39d10e5742a921c2a6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0b39d10e5742a921c2a698c2c1ae6e818b37ece259fdd5cb6b9e9699823c3883
+size 1001871
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19300_94b13fb13907b7ef90f8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19300_94b13fb13907b7ef90f8.png
new file mode 100644
index 0000000000000000000000000000000000000000..943c9fbf55528dfd9d7c09b027d96a0bfbf3a3e5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19300_94b13fb13907b7ef90f8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:94b13fb13907b7ef90f88044d8cc4f121349914925726b9d192a264e6eb75669
+size 536277
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19300_983372697e864f242dbc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19300_983372697e864f242dbc.png
new file mode 100644
index 0000000000000000000000000000000000000000..2c56fc958f91340cf33d7c7b01d2f1492e09d474
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19300_983372697e864f242dbc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:983372697e864f242dbc8f3eff75be0a4c1da33e94ddb637b893e5926c35a934
+size 625130
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19300_e7e69dd7f60577c70428.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19300_e7e69dd7f60577c70428.png
new file mode 100644
index 0000000000000000000000000000000000000000..8e2f70ddbb6d2711c1843a5518afdf735a8621e5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19300_e7e69dd7f60577c70428.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e7e69dd7f60577c704285cd0452410b897778aa1556b58aa42d75aa19ddacaa7
+size 460380
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19500_01e0d9f82e90f500a5db.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19500_01e0d9f82e90f500a5db.png
new file mode 100644
index 0000000000000000000000000000000000000000..ce2bb509aeb1abebb4fd201c21ce20770b2c297c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19500_01e0d9f82e90f500a5db.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:01e0d9f82e90f500a5dbbbe0da52495ef1d8b4cee285eadd8e40555f779a979e
+size 490576
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19500_80c45eed46c1169100a6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19500_80c45eed46c1169100a6.png
new file mode 100644
index 0000000000000000000000000000000000000000..b44ab9d608d749e9335e806da4bb2687c809e7a6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19500_80c45eed46c1169100a6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:80c45eed46c1169100a6346ac23d64d17114aa785ae62eccfed586163d961de9
+size 591512
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19500_a396ad8aad693ca9a9cd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19500_a396ad8aad693ca9a9cd.png
new file mode 100644
index 0000000000000000000000000000000000000000..bc6916bead2a200682888f8ad258acba0c5e306a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19500_a396ad8aad693ca9a9cd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a396ad8aad693ca9a9cddc237742b07178d5dc35556de92730f102d94a363bf1
+size 370558
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19500_b9ff1f12cf798fe9761e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19500_b9ff1f12cf798fe9761e.png
new file mode 100644
index 0000000000000000000000000000000000000000..6c590bbcea50863671bbd7f6f7c0626696ba4e18
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19500_b9ff1f12cf798fe9761e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b9ff1f12cf798fe9761e492436c0df9607d5aa3e778081474f0d7eae42893f7f
+size 621013
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19700_169a6b80394cf7e3174c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19700_169a6b80394cf7e3174c.png
new file mode 100644
index 0000000000000000000000000000000000000000..a855b4c6f18ee32bd1556a525ff18d822c4ebb44
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19700_169a6b80394cf7e3174c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:169a6b80394cf7e3174c44ccab7c4e5755eb3f05f94e5d7fffe503cb625ebe33
+size 414491
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19700_24ae4df9da2d3bd5ff27.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19700_24ae4df9da2d3bd5ff27.png
new file mode 100644
index 0000000000000000000000000000000000000000..3f072d924a65c2f13ce74b42e3076b0c7121055b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19700_24ae4df9da2d3bd5ff27.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:24ae4df9da2d3bd5ff2775874edfb1bd29ed9ce2e83fe9c3eab0514274b9325f
+size 937062
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19700_a1a70e7befd5f21fb4a1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19700_a1a70e7befd5f21fb4a1.png
new file mode 100644
index 0000000000000000000000000000000000000000..3107bd93d90e609f9d2c9ad95bc8b53b5db7cb80
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19700_a1a70e7befd5f21fb4a1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a1a70e7befd5f21fb4a1202c392d6d40ae3b0e36432349f94e0f827cad4c5cdd
+size 425789
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19700_b73f14fccc92691a82a5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19700_b73f14fccc92691a82a5.png
new file mode 100644
index 0000000000000000000000000000000000000000..cff429afb7fa8b6029b92ad31ecd30fda8ba690b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19700_b73f14fccc92691a82a5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b73f14fccc92691a82a56d212a34f16a92c475b81f57b9316937684589f020d3
+size 515007
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19900_5647cfb1bebfa3d7275e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19900_5647cfb1bebfa3d7275e.png
new file mode 100644
index 0000000000000000000000000000000000000000..3ed184e419a0c5d01e598fff3e670f9567d687f5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19900_5647cfb1bebfa3d7275e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5647cfb1bebfa3d7275e2f4c0513a939f9f041ba34e3fb67d376447c157d2300
+size 795429
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19900_592cd324f731f1b28c69.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19900_592cd324f731f1b28c69.png
new file mode 100644
index 0000000000000000000000000000000000000000..29e00c9b85f213d543be5bca8dbc4e0c0e3b403c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19900_592cd324f731f1b28c69.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:592cd324f731f1b28c696ec50fda9a8018ba0880b884f505a29ea254a957d7c7
+size 760357
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19900_5ef854c89f6648798d2b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19900_5ef854c89f6648798d2b.png
new file mode 100644
index 0000000000000000000000000000000000000000..7a03dd970a8c6e7ed5cdfb5d70c261e31c146b20
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19900_5ef854c89f6648798d2b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5ef854c89f6648798d2bfc4451f8adc1f5f83987d54e8cfc8b290eb1c10f721d
+size 698725
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19900_643c961f75b9b0948fa0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19900_643c961f75b9b0948fa0.png
new file mode 100644
index 0000000000000000000000000000000000000000..9e01caad64430b4c7addcfbf987ede5411fbe339
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_19900_643c961f75b9b0948fa0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:643c961f75b9b0948fa0c87105526289c9b3b6e2c280d52315ab2a9c9d1c537a
+size 976741
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20100_13f3d84064e83ea625bb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20100_13f3d84064e83ea625bb.png
new file mode 100644
index 0000000000000000000000000000000000000000..efe0b00c4a38ced877a93d5a92b883a4cfc6c34d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20100_13f3d84064e83ea625bb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:13f3d84064e83ea625bba487f81a5d926a30f3b518c835ad7c7c33b933e7ac4b
+size 511476
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20100_217cbd4ceed10f8f7cdc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20100_217cbd4ceed10f8f7cdc.png
new file mode 100644
index 0000000000000000000000000000000000000000..ebab75aa30034c978b32c199a0588d1047cd71da
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20100_217cbd4ceed10f8f7cdc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:217cbd4ceed10f8f7cdcf4b1c944fc0194f21e03404d598b00cdc27cc11d7593
+size 792159
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20100_2e2aa22248dbf6e5f26f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20100_2e2aa22248dbf6e5f26f.png
new file mode 100644
index 0000000000000000000000000000000000000000..e324b7c84bcc8a9e0c50b8b57d497258087cc843
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20100_2e2aa22248dbf6e5f26f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2e2aa22248dbf6e5f26ff31718bfddeb6114254f8f69381792804d03897d9cb8
+size 691010
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20100_c05c8026a3db4ab95371.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20100_c05c8026a3db4ab95371.png
new file mode 100644
index 0000000000000000000000000000000000000000..67ce57e13ce53f3c6abb6874f198cee04dcc8e95
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20100_c05c8026a3db4ab95371.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c05c8026a3db4ab95371e5213a0f9e236c31a06a561690e6e26af4d1cd91104b
+size 962903
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20300_2fcd6d9b1bf6fa716b32.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20300_2fcd6d9b1bf6fa716b32.png
new file mode 100644
index 0000000000000000000000000000000000000000..52974659fbffeedc43f69a1570da4ba2991ef3df
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20300_2fcd6d9b1bf6fa716b32.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2fcd6d9b1bf6fa716b32209a0e5b43db426068700f001ecfe8156ee8bc6f4d42
+size 1036005
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20300_40ffbf5b21eee70c62bc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20300_40ffbf5b21eee70c62bc.png
new file mode 100644
index 0000000000000000000000000000000000000000..ec666ae3f14b99ef8ff1a98ee947a56cafb3a535
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20300_40ffbf5b21eee70c62bc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:40ffbf5b21eee70c62bc42eaf5206631309dac7b53bf927c9acd8731208bbb14
+size 1112590
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20300_b04fc2addb57226f5f98.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20300_b04fc2addb57226f5f98.png
new file mode 100644
index 0000000000000000000000000000000000000000..5172e361bb8acb456479a21fd3566296710b7e47
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20300_b04fc2addb57226f5f98.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b04fc2addb57226f5f98a8a9a705b2295c7910c190787ff66b36ffd9f7743575
+size 685879
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20300_c6b7cb85f250691d87f4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20300_c6b7cb85f250691d87f4.png
new file mode 100644
index 0000000000000000000000000000000000000000..3b185a0b1e32deb278491d54d5be5cdbb9232bd5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20300_c6b7cb85f250691d87f4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c6b7cb85f250691d87f44dacdf9fe91bdc3c386b7f800023900996f3e8bf593f
+size 1129776
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20500_108f1562de6420fe226c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20500_108f1562de6420fe226c.png
new file mode 100644
index 0000000000000000000000000000000000000000..62f97a94770b1de499d191bbc5832c0c35471641
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20500_108f1562de6420fe226c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:108f1562de6420fe226c255f10dd54a5ff96ef80e7690e1f4d58b9909b080e7b
+size 555050
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20500_442b8112f8c34f7c2bd9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20500_442b8112f8c34f7c2bd9.png
new file mode 100644
index 0000000000000000000000000000000000000000..13c4eee7d36787d989fd948bb57d3b2ff46263a9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20500_442b8112f8c34f7c2bd9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:442b8112f8c34f7c2bd95d2bf106f98914fda9dfa92779cd459a9a02b8f14089
+size 711455
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20500_934bcd4fc58ff1132dc8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20500_934bcd4fc58ff1132dc8.png
new file mode 100644
index 0000000000000000000000000000000000000000..48cfddb9bf9c5bb2e679b36b45f6d09ea9dfe1ea
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20500_934bcd4fc58ff1132dc8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:934bcd4fc58ff1132dc87a7c980d8035faefda04d591a096f1b3ace529b12ac5
+size 825367
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20500_df382640f809e9bddd55.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20500_df382640f809e9bddd55.png
new file mode 100644
index 0000000000000000000000000000000000000000..5940d1d62c6f99a3f74db437afbfd9c86289b2cb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20500_df382640f809e9bddd55.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:df382640f809e9bddd55a6728d6b39b6e048751c58b39ba7876e66f82e792837
+size 481904
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20700_106377afc87f6dc6aa05.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20700_106377afc87f6dc6aa05.png
new file mode 100644
index 0000000000000000000000000000000000000000..ddf7902c342269451f19596810568a4313018591
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20700_106377afc87f6dc6aa05.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:106377afc87f6dc6aa057ac7168a7df9bcf4e497f3915cc2090122512a56b715
+size 799464
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20700_116e0afcbb9a5a631e67.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20700_116e0afcbb9a5a631e67.png
new file mode 100644
index 0000000000000000000000000000000000000000..bb19ae535d73b53ecf737d8105b6ff0431de9854
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20700_116e0afcbb9a5a631e67.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:116e0afcbb9a5a631e6790493287cd01f41627e8b7faed4e41008b37149a8550
+size 765870
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20700_83c2af7af812a18b5067.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20700_83c2af7af812a18b5067.png
new file mode 100644
index 0000000000000000000000000000000000000000..b899ac2d81f438d9ef99b1c60e5aad064f14c02e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20700_83c2af7af812a18b5067.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:83c2af7af812a18b506726c4585dc029828dace4607e94ebec1cbe3343b2d1d5
+size 737977
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20700_9da46d539bf5cd4d7f7c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20700_9da46d539bf5cd4d7f7c.png
new file mode 100644
index 0000000000000000000000000000000000000000..5f0e2970a213a3e0ba047622c616cfe8f0297073
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20700_9da46d539bf5cd4d7f7c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9da46d539bf5cd4d7f7ce4af355616408b927db1f0df22e65730e567d3ca2f58
+size 744329
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20900_6aafd848e514e2ae2c2f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20900_6aafd848e514e2ae2c2f.png
new file mode 100644
index 0000000000000000000000000000000000000000..f28c7d89f358a8ee771eccc0a77c7eceb309e7ea
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20900_6aafd848e514e2ae2c2f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6aafd848e514e2ae2c2fc44c37a578448f71c988aec4ef061a5a8d5d4bc5d221
+size 817212
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20900_9bd3368091564f5f09d3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20900_9bd3368091564f5f09d3.png
new file mode 100644
index 0000000000000000000000000000000000000000..a637fbd6d947d46b6170ba4a0308a03fd4cf82ae
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20900_9bd3368091564f5f09d3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9bd3368091564f5f09d328a521fd0434386dc5cf23380627845bef822c9a0f2b
+size 970560
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20900_9d7694e62dbea1f3dd6e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20900_9d7694e62dbea1f3dd6e.png
new file mode 100644
index 0000000000000000000000000000000000000000..06d1b94d2a712612f17219f4f0759f1f8659ce81
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20900_9d7694e62dbea1f3dd6e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9d7694e62dbea1f3dd6e256bab780607a71d0456d631d2a29d28c7cacf5ce968
+size 619413
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20900_e851452de9dec0f806f6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20900_e851452de9dec0f806f6.png
new file mode 100644
index 0000000000000000000000000000000000000000..16ac5664c21c48a17554080c86ecec98b7eeb74b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_20900_e851452de9dec0f806f6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e851452de9dec0f806f6878b8dbf7eae7436843e6154ba522f0c1a97c68fb8b2
+size 638767
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2100_175a60f327f5856061c1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2100_175a60f327f5856061c1.png
new file mode 100644
index 0000000000000000000000000000000000000000..96d0f463bda6a8fde67633df7c47fe8cbc2c9ae6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2100_175a60f327f5856061c1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:175a60f327f5856061c1459bbb1647f1dc967ac6aff1c7301b317e8c8f46a19b
+size 902255
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2100_3837d34d430742e17c96.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2100_3837d34d430742e17c96.png
new file mode 100644
index 0000000000000000000000000000000000000000..e364ad7e9e5a4cd7b475fa032bce65dacad321ba
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2100_3837d34d430742e17c96.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3837d34d430742e17c96ea10235bd1b1c23598ee83e546212183af8477a5352d
+size 336708
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2100_6c72472ff4d15940386e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2100_6c72472ff4d15940386e.png
new file mode 100644
index 0000000000000000000000000000000000000000..a9028042e13f5126ad51b3afed2297d025eea45e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2100_6c72472ff4d15940386e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6c72472ff4d15940386e244898e1c9de7590c2313ea854cb4274adb6e1d0b267
+size 901155
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2100_bad52616ee39e65dfb5d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2100_bad52616ee39e65dfb5d.png
new file mode 100644
index 0000000000000000000000000000000000000000..ae248f97c8f5b5fd7f1f9b9b8c326a36a6f72851
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2100_bad52616ee39e65dfb5d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bad52616ee39e65dfb5df3bdc6ba0af35ed60ae67a5b860199149f6e52110ff8
+size 644304
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21100_23848c87f922bdf6be1d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21100_23848c87f922bdf6be1d.png
new file mode 100644
index 0000000000000000000000000000000000000000..4759fe91a0d3eae164550f518c828d5f9b645143
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21100_23848c87f922bdf6be1d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:23848c87f922bdf6be1d1dab206c9167d48839cfc8eb19b68a8e58d72a24cd30
+size 594604
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21100_2851c9c4aadf105d581f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21100_2851c9c4aadf105d581f.png
new file mode 100644
index 0000000000000000000000000000000000000000..3d45296eccfbaa32512af057f8c1b01b9cd7779f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21100_2851c9c4aadf105d581f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2851c9c4aadf105d581ff1694f83ad3f797c53079addf0a00a61c7541d5786f3
+size 1270949
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21100_83f96b1ad2ed6a6c9593.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21100_83f96b1ad2ed6a6c9593.png
new file mode 100644
index 0000000000000000000000000000000000000000..b40a0f7f60648ad17c341252436a2a6f6ffbb2b1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21100_83f96b1ad2ed6a6c9593.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:83f96b1ad2ed6a6c9593a57ca8effdc41fe29fae10d29b8a73652f1aa89538e0
+size 648874
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21100_ab8acb247832b3cdc713.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21100_ab8acb247832b3cdc713.png
new file mode 100644
index 0000000000000000000000000000000000000000..28ac9042427ad142085a6a813bd0ca0d068e67eb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21100_ab8acb247832b3cdc713.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ab8acb247832b3cdc7131a2113b04cf1c62a31f03ae4986a604a136ca02bd8ca
+size 1039811
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21300_28a0a199e4db0dfbfcc8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21300_28a0a199e4db0dfbfcc8.png
new file mode 100644
index 0000000000000000000000000000000000000000..72d0824d1b35d5fabad1c115863588ca3692fb23
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21300_28a0a199e4db0dfbfcc8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:28a0a199e4db0dfbfcc8676140a259991cc497027454ffe32debe48153eaf80c
+size 1206420
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21300_5c83a9225021538c0469.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21300_5c83a9225021538c0469.png
new file mode 100644
index 0000000000000000000000000000000000000000..6560aa7d1be2c336625b2bec0604e5700706f59c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21300_5c83a9225021538c0469.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5c83a9225021538c0469dc1ac27d90ed007aa164e53630d600e7aaa2755e5015
+size 1125954
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21300_9a757455f2067a6b058a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21300_9a757455f2067a6b058a.png
new file mode 100644
index 0000000000000000000000000000000000000000..fc8dca5c5d64bc9cccde319cd83134d73da642fe
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21300_9a757455f2067a6b058a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9a757455f2067a6b058a1907701d4020d362182b794dbcd89e03812a7c20cb43
+size 722070
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21300_a48fbc6a8206b3bc788b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21300_a48fbc6a8206b3bc788b.png
new file mode 100644
index 0000000000000000000000000000000000000000..bab622986a9d584e66cf0ab243c7dbca64c09b84
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21300_a48fbc6a8206b3bc788b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a48fbc6a8206b3bc788bec509e7c186395a095c223f81b54a3d3b086c4f0f3ca
+size 479384
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21500_274a715097411d02fa0e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21500_274a715097411d02fa0e.png
new file mode 100644
index 0000000000000000000000000000000000000000..f83de8e1c783ab9ef7e050d8088dddbe0bf0eb5e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21500_274a715097411d02fa0e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:274a715097411d02fa0eae594b202a567124a4ed4942ce87630ad7a257dd77b3
+size 920973
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21500_39f3d14bdbfb02e86d0e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21500_39f3d14bdbfb02e86d0e.png
new file mode 100644
index 0000000000000000000000000000000000000000..63146d3d3b5e8ffb37d6a208f59be8575bb42d0f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21500_39f3d14bdbfb02e86d0e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:39f3d14bdbfb02e86d0e23d809e10e8c23e37107e583007e23050a7b69bb846d
+size 222329
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21500_442787d5351249f1597c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21500_442787d5351249f1597c.png
new file mode 100644
index 0000000000000000000000000000000000000000..3b7aec1ce94284867a3384b7ad66e121ed82a631
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21500_442787d5351249f1597c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:442787d5351249f1597c47731f7a25911d98f2fc254f3b91c2fccbc04455a5f1
+size 354328
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21500_5c74b3028ff6bf26952f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21500_5c74b3028ff6bf26952f.png
new file mode 100644
index 0000000000000000000000000000000000000000..dfd4fd38d64c9b80b2c4ab83b759dcb7c9550198
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21500_5c74b3028ff6bf26952f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5c74b3028ff6bf26952f19efafb1aaa4ad48ca5b5a6559990c64931e28f05108
+size 830401
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21700_200f60fa07e3a93137c1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21700_200f60fa07e3a93137c1.png
new file mode 100644
index 0000000000000000000000000000000000000000..66cb49a648da7d7d81e107981a160a2861b5524a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21700_200f60fa07e3a93137c1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:200f60fa07e3a93137c168761ab1401f8fb0487fe5522b05122491087aa85df7
+size 1320190
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21700_b0c015ab5bc47e95a2a6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21700_b0c015ab5bc47e95a2a6.png
new file mode 100644
index 0000000000000000000000000000000000000000..e9f85973ca697885e32f04a5b5a3587e780544d9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21700_b0c015ab5bc47e95a2a6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b0c015ab5bc47e95a2a67bf66aaaa2dd77db518d444860dfe6aa1a0c9683d0b7
+size 509673
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21700_cf420e77b1a15f5471dc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21700_cf420e77b1a15f5471dc.png
new file mode 100644
index 0000000000000000000000000000000000000000..a4c8d83e455b7c7f2c8afbeaafeded6a1c7ac2f5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21700_cf420e77b1a15f5471dc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cf420e77b1a15f5471dcb7f773f04973136debaa1228464c9a07f390d1a0576f
+size 160235
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21700_eb970913a1034ef3acf8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21700_eb970913a1034ef3acf8.png
new file mode 100644
index 0000000000000000000000000000000000000000..6ef6c48b8c31d1218fad47a732a60b6779c0a889
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21700_eb970913a1034ef3acf8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:eb970913a1034ef3acf8168de293f993a46637bf3caa0e70d15fe5c42b2f151c
+size 1107050
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21900_11e7478f5886099da294.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21900_11e7478f5886099da294.png
new file mode 100644
index 0000000000000000000000000000000000000000..e377129bbc73c63ce93a078dc38c3f211d1eafde
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21900_11e7478f5886099da294.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:11e7478f5886099da2941fa53e0f6b7cbbf0452e9c2a81c4a35c903dc32ce121
+size 1768782
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21900_2b90fc3780276adb2847.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21900_2b90fc3780276adb2847.png
new file mode 100644
index 0000000000000000000000000000000000000000..9c7f964fb5c13f4899d16c13a92ef499a6b92c65
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21900_2b90fc3780276adb2847.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2b90fc3780276adb28470842e5f40f00fa205464c518849ff63b3c0c9917cd46
+size 1212003
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21900_3cfd6c1629f2e291f37d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21900_3cfd6c1629f2e291f37d.png
new file mode 100644
index 0000000000000000000000000000000000000000..b9b18d3e8510307005d46758ad1c19607399d509
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21900_3cfd6c1629f2e291f37d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3cfd6c1629f2e291f37d61c640e0052ca74f105480f72f3fd69534283e3bfb0a
+size 582352
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21900_bfc6ff6a4c70cad6d413.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21900_bfc6ff6a4c70cad6d413.png
new file mode 100644
index 0000000000000000000000000000000000000000..9cb11bd0254cc98cae8def28bc520ddeb4c1f7e0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_21900_bfc6ff6a4c70cad6d413.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bfc6ff6a4c70cad6d4131d473c2179b091cbaf494dfd21ec41b84c783a61bc9a
+size 559864
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22100_4f4a217899b50a6ea32a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22100_4f4a217899b50a6ea32a.png
new file mode 100644
index 0000000000000000000000000000000000000000..6a7c4b0a0161764fffeb7b463e6ead6a3f7923a0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22100_4f4a217899b50a6ea32a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4f4a217899b50a6ea32a8f56172779327815bd02fc836288235a6f08b247e3e9
+size 506699
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22100_52f8c64d3787f20c5d69.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22100_52f8c64d3787f20c5d69.png
new file mode 100644
index 0000000000000000000000000000000000000000..a5c41ea0101e1b5aa63cf050b0d0750982e754fd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22100_52f8c64d3787f20c5d69.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:52f8c64d3787f20c5d69e135a090ab770fca7f6f68939d96167a3ef3ff3a74fe
+size 1200264
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22100_d2226130b2975c68d62f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22100_d2226130b2975c68d62f.png
new file mode 100644
index 0000000000000000000000000000000000000000..18d66a62f0f8f235390c0b002132a336a165792f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22100_d2226130b2975c68d62f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d2226130b2975c68d62f57a4cb2e1b7c82d73300b3908d5e79661ec4804ade7b
+size 493744
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22100_ebdbc60fbf67bfcf0297.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22100_ebdbc60fbf67bfcf0297.png
new file mode 100644
index 0000000000000000000000000000000000000000..dd35fe93b0c3ce72a60449b1aaacf22c385adc70
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22100_ebdbc60fbf67bfcf0297.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ebdbc60fbf67bfcf02978be1c72df7f21eb8399b06880f4069a9b57c47e4ce51
+size 940127
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22300_0d5df32c1d32c9b14f9f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22300_0d5df32c1d32c9b14f9f.png
new file mode 100644
index 0000000000000000000000000000000000000000..4e5fb6c4b563f4ad2b4b88680de7bf8d1d0bcc42
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22300_0d5df32c1d32c9b14f9f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0d5df32c1d32c9b14f9fbee9d707f737781197ed95994ad08f1dc3de189a2ff8
+size 998915
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22300_299213873ed0fcf8571b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22300_299213873ed0fcf8571b.png
new file mode 100644
index 0000000000000000000000000000000000000000..e2f8f43daf649a1d22408163aa5405c6171718f0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22300_299213873ed0fcf8571b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:299213873ed0fcf8571b16fda3f158c1be7184978d33530e6e965f0607d32d97
+size 930865
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22300_60f5889333cc4cbcc334.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22300_60f5889333cc4cbcc334.png
new file mode 100644
index 0000000000000000000000000000000000000000..8dcb97963eb151d96d9b43c52bd878925e1118be
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22300_60f5889333cc4cbcc334.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:60f5889333cc4cbcc33426e9e9752c3683691987376dda4d89c6a155d6fd4234
+size 462223
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22300_d6b3354c4e3020554f38.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22300_d6b3354c4e3020554f38.png
new file mode 100644
index 0000000000000000000000000000000000000000..43e7930299280580b23ede1f03f1c0eedcfa0406
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22300_d6b3354c4e3020554f38.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d6b3354c4e3020554f38954b38875c5f133bd002803f1dd3765f62f287861764
+size 437203
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22500_2f91565ffd290d9771fb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22500_2f91565ffd290d9771fb.png
new file mode 100644
index 0000000000000000000000000000000000000000..d6df4b3f274a5c4aecd72b6ad01f643ad39d3623
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22500_2f91565ffd290d9771fb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2f91565ffd290d9771fbfcff642d9ddbb074c0a3b9ab875a252ba5baad007c7c
+size 813250
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22500_4b5ec017083e2989fe78.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22500_4b5ec017083e2989fe78.png
new file mode 100644
index 0000000000000000000000000000000000000000..9c08c5ec5440854ea7f850b65ef6388b121fbac5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22500_4b5ec017083e2989fe78.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4b5ec017083e2989fe785895c691d52c4cf51b65485b741cbc9b1ca7d2e773e2
+size 498305
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22500_84ef4f4ac667236218c7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22500_84ef4f4ac667236218c7.png
new file mode 100644
index 0000000000000000000000000000000000000000..41d3630367dabce51e1492e5b553b646ec6fb7be
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22500_84ef4f4ac667236218c7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:84ef4f4ac667236218c702f64857b76391816397e44d826b3f03759d3a89e633
+size 349331
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22500_8d1640b2d978e962d608.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22500_8d1640b2d978e962d608.png
new file mode 100644
index 0000000000000000000000000000000000000000..bdea1ad5e507318266286ef28b23b156741df202
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22500_8d1640b2d978e962d608.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8d1640b2d978e962d6082b35c42c84244389270742199588d266bdb192278163
+size 1164685
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22700_2f9d56a1fa4d75b71aaa.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22700_2f9d56a1fa4d75b71aaa.png
new file mode 100644
index 0000000000000000000000000000000000000000..e88df540a2a0383bf73c066eb082a26b27fbd91c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22700_2f9d56a1fa4d75b71aaa.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2f9d56a1fa4d75b71aaac8febcf59d8cd8a36ea7541327b353981cb117810a51
+size 901835
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22700_3765f0950011f2fdeea6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22700_3765f0950011f2fdeea6.png
new file mode 100644
index 0000000000000000000000000000000000000000..88fa672c329a6b9fe05cd1ebf41221df36212714
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22700_3765f0950011f2fdeea6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3765f0950011f2fdeea6f769df7bc85514f05e3e5007e63ec842623906416d9a
+size 624899
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22700_6748db406cea9cef45bd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22700_6748db406cea9cef45bd.png
new file mode 100644
index 0000000000000000000000000000000000000000..ca7ecd90fbe7bf13d15b7ad2f10b0404f7d0a0f9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22700_6748db406cea9cef45bd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6748db406cea9cef45bd8a1640268a091ef44c11cb383f9278bcfd901419b632
+size 399070
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22700_f441058efd877ee156ed.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22700_f441058efd877ee156ed.png
new file mode 100644
index 0000000000000000000000000000000000000000..bb73eb559bbba0c98a3e42ae803cca92395e7bce
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22700_f441058efd877ee156ed.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f441058efd877ee156eda0347eead1b175bc08f2c6a743698f6caa3ec4971094
+size 1243075
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22900_7bb856cb30e92d68d040.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22900_7bb856cb30e92d68d040.png
new file mode 100644
index 0000000000000000000000000000000000000000..aa301c1c058b27d48b8990fff99fdec0643acf41
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22900_7bb856cb30e92d68d040.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7bb856cb30e92d68d04022bf113072e95449ed385a0da7f55ba340108997a6cd
+size 553091
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22900_90571b9cf688fa590b44.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22900_90571b9cf688fa590b44.png
new file mode 100644
index 0000000000000000000000000000000000000000..612fb1b273712ba7e15365cb7805899c6c16526e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22900_90571b9cf688fa590b44.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:90571b9cf688fa590b449b474d323da6f47d0408cd60d1ba1d5f383bffc8065e
+size 658384
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22900_cccc79a5b729b15f97a8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22900_cccc79a5b729b15f97a8.png
new file mode 100644
index 0000000000000000000000000000000000000000..37164e51f1784244c8d201677fa68a948c90531e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22900_cccc79a5b729b15f97a8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cccc79a5b729b15f97a837ae9d58739db7e05230284e86a9e54e03a19b56d56c
+size 866688
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22900_fc33ad74acf4a98a30ee.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22900_fc33ad74acf4a98a30ee.png
new file mode 100644
index 0000000000000000000000000000000000000000..44cb9ce36bd26ebb1d739a5a88f29c2d0c262b88
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_22900_fc33ad74acf4a98a30ee.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fc33ad74acf4a98a30eeff28c86c6060fcc6f85caa4bca7aa8963d9f606a99ba
+size 523415
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2300_5bde6bca6f493c652b7c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2300_5bde6bca6f493c652b7c.png
new file mode 100644
index 0000000000000000000000000000000000000000..53d18abf787d863a71c17418b5c0b775b1dee043
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2300_5bde6bca6f493c652b7c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5bde6bca6f493c652b7c486ab564be7b333a6eb1e033565de5a8c907a2f805a4
+size 220904
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2300_68d6f06168daee646685.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2300_68d6f06168daee646685.png
new file mode 100644
index 0000000000000000000000000000000000000000..93a0c4d991f1fe287d199677be3362d7d6a78d2d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2300_68d6f06168daee646685.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:68d6f06168daee646685992c282f9f39c2124a138358fc16ebe3403a70f527c9
+size 653043
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2300_6d4e005e39a8f86a0138.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2300_6d4e005e39a8f86a0138.png
new file mode 100644
index 0000000000000000000000000000000000000000..33ec6c88b76e0876028ad3eb9987716c529035d8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2300_6d4e005e39a8f86a0138.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6d4e005e39a8f86a0138f4244b4184eb1cb3ede1d2446b21073fff0f09b33aba
+size 974217
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2300_df4ada286dde3e76bda5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2300_df4ada286dde3e76bda5.png
new file mode 100644
index 0000000000000000000000000000000000000000..cabb7594e38c0eb48df9777b8faffd2e40bbdb52
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2300_df4ada286dde3e76bda5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:df4ada286dde3e76bda567432e77df81e859d2da0b896801195ae0732bca81b2
+size 739252
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23100_19b03b47743b6fd36e2c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23100_19b03b47743b6fd36e2c.png
new file mode 100644
index 0000000000000000000000000000000000000000..aadc98261587cb8cc462ce9f79365663485c4761
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23100_19b03b47743b6fd36e2c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:19b03b47743b6fd36e2c07239482d52d6e9ea387fc9434656d39099d11d10f93
+size 565127
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23100_ad43cc87106d4b7d4be1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23100_ad43cc87106d4b7d4be1.png
new file mode 100644
index 0000000000000000000000000000000000000000..8252ee17ff6f22de18e3b3c6ea2fc7843de5e56f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23100_ad43cc87106d4b7d4be1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ad43cc87106d4b7d4be188a13a972765ea02d5d940652060e7cedd9ffee9bd8f
+size 1524291
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23100_e64fb3b00e0b0aa8801d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23100_e64fb3b00e0b0aa8801d.png
new file mode 100644
index 0000000000000000000000000000000000000000..4d146d8fa8c8b6055df380ed6bbf3b54d4e80589
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23100_e64fb3b00e0b0aa8801d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e64fb3b00e0b0aa8801da7b6c102cbd8dbfb8570d3bd2c303e0f6e8d40f20ce5
+size 812454
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23100_ebe47bd4fff851310dde.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23100_ebe47bd4fff851310dde.png
new file mode 100644
index 0000000000000000000000000000000000000000..f2f1bfc7283261e3b96278103dccc2ad4de6afc8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23100_ebe47bd4fff851310dde.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ebe47bd4fff851310dde3ccc7229f1a6800f5ac21571f3f136ffc9c11e397568
+size 814929
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23300_52a95cd1e3e5177ea135.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23300_52a95cd1e3e5177ea135.png
new file mode 100644
index 0000000000000000000000000000000000000000..243f4602fa5cd8c0a0de063ad328ad085fa1f4be
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23300_52a95cd1e3e5177ea135.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:52a95cd1e3e5177ea135f412038c6aa54011b9fd1d74d29774c8beea5e55535b
+size 1668584
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23300_7a5d5bffd388015e98bd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23300_7a5d5bffd388015e98bd.png
new file mode 100644
index 0000000000000000000000000000000000000000..1ca676d03e2f2cafce0897ff7d61fbdbbcec5dc1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23300_7a5d5bffd388015e98bd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7a5d5bffd388015e98bd04a263b086c7381b77228dedea7fd84282aada2682c5
+size 408398
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23300_def8ada682ff30f893fd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23300_def8ada682ff30f893fd.png
new file mode 100644
index 0000000000000000000000000000000000000000..9e2803e76bf5ec481c4abb6daada9917d61b94da
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23300_def8ada682ff30f893fd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:def8ada682ff30f893fde37c9467ddf2d1ede9803c4e82351ebfa33790520dca
+size 295154
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23300_f6aab1dec7eda1381b7d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23300_f6aab1dec7eda1381b7d.png
new file mode 100644
index 0000000000000000000000000000000000000000..2af93d74bd982f3866044b96ca7c941b9caf4ad8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23300_f6aab1dec7eda1381b7d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f6aab1dec7eda1381b7d0aa57215624b94bd2d187dc6741ef542be49cdf4626e
+size 747590
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23500_38aa560536f1d753a72b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23500_38aa560536f1d753a72b.png
new file mode 100644
index 0000000000000000000000000000000000000000..62d73c6dbe89cb09b78da74f2d3731d7348f25bd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23500_38aa560536f1d753a72b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:38aa560536f1d753a72b56c732f2d63557cb902eb7a96e649e2c79937cc632c1
+size 909066
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23500_760448afae46f5ee8f19.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23500_760448afae46f5ee8f19.png
new file mode 100644
index 0000000000000000000000000000000000000000..871680060d36184cb7a2e536b995fed6d3270ae5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23500_760448afae46f5ee8f19.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:760448afae46f5ee8f194865844c9490760a965d342d281208ceaabda0f88984
+size 520408
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23500_caf5cd1cc3d3d5f9e3c8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23500_caf5cd1cc3d3d5f9e3c8.png
new file mode 100644
index 0000000000000000000000000000000000000000..ae90c6d421edce0e00c01b777bc5bde08c29b027
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23500_caf5cd1cc3d3d5f9e3c8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:caf5cd1cc3d3d5f9e3c8732813a0ddcd4e3c671b8bfee58a3916b1e699f05a82
+size 639450
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23500_fee7094bb8dfba24663e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23500_fee7094bb8dfba24663e.png
new file mode 100644
index 0000000000000000000000000000000000000000..4bfbb785056cd63455de5a65c385488ce62861bd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23500_fee7094bb8dfba24663e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fee7094bb8dfba24663e616b4e3769557ac2420d196bb4d9032f324b89ffac67
+size 1023485
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23700_272357b33b4a4ba796c2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23700_272357b33b4a4ba796c2.png
new file mode 100644
index 0000000000000000000000000000000000000000..808d462e031f6e816fbb59c51fd27fb1222cda22
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23700_272357b33b4a4ba796c2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:272357b33b4a4ba796c2ab3a8a07636b007e81aa700185ea8dc57880e67a7a32
+size 988137
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23700_4a74cac700d312909efa.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23700_4a74cac700d312909efa.png
new file mode 100644
index 0000000000000000000000000000000000000000..853e093bf4bb97f6b3d314ad2a4d7fd934972038
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23700_4a74cac700d312909efa.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4a74cac700d312909efa52058506babf28f25e5d22c1c8fdcb50dd9b81fcb4c9
+size 1064277
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23700_aa048a5b9d53bea5d477.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23700_aa048a5b9d53bea5d477.png
new file mode 100644
index 0000000000000000000000000000000000000000..5a8b1066f311ce2ea43d57b112d6c0632ea18aff
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23700_aa048a5b9d53bea5d477.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aa048a5b9d53bea5d477fd59ab5b131dff96086e305ec39bea03fe08aeae3f82
+size 447236
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23700_aae21517d1c3e27ae218.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23700_aae21517d1c3e27ae218.png
new file mode 100644
index 0000000000000000000000000000000000000000..addb3085240af307ab7bc5033f1f9d467defca16
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23700_aae21517d1c3e27ae218.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aae21517d1c3e27ae2187989ea84343f9cd4829c91e1e6ba9815f6bd4d0d49ae
+size 574343
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23900_3bca7e7d8d7b6220e207.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23900_3bca7e7d8d7b6220e207.png
new file mode 100644
index 0000000000000000000000000000000000000000..517a8885a5cb369dbe0c2c0d69d797175d2d15c1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23900_3bca7e7d8d7b6220e207.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3bca7e7d8d7b6220e2078ce9fbd1ee1db9cdfd3f0e015fe876c85bd193d114ca
+size 895450
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23900_6ed37f4bcd72770c5c55.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23900_6ed37f4bcd72770c5c55.png
new file mode 100644
index 0000000000000000000000000000000000000000..b9df315b34c96669f72851319e523fa07b68705e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23900_6ed37f4bcd72770c5c55.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6ed37f4bcd72770c5c553ed0f5ebc3ad122487bd6077b3d8e633fd43f41c38b1
+size 572224
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23900_bb955f6ab6921b65a026.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23900_bb955f6ab6921b65a026.png
new file mode 100644
index 0000000000000000000000000000000000000000..0d53796c47f6a877e94d5c5acb6ce4993214bbdc
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23900_bb955f6ab6921b65a026.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bb955f6ab6921b65a0266a52dc4842d2d21f4c60302a54512fcf05a5d70c5c60
+size 587187
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23900_c49d203cedc5b7b31976.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23900_c49d203cedc5b7b31976.png
new file mode 100644
index 0000000000000000000000000000000000000000..eab44e4d3b28fcab84d700d87d0f009862c3c8e9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_23900_c49d203cedc5b7b31976.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c49d203cedc5b7b31976f1f414d4a2088ed9ababfec4d1c9ba0e58508052b042
+size 1079295
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24100_4fa5b561be2bf1d2ac13.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24100_4fa5b561be2bf1d2ac13.png
new file mode 100644
index 0000000000000000000000000000000000000000..3d33e971dee6f46cc1e53c5e36078e365adc2a9c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24100_4fa5b561be2bf1d2ac13.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4fa5b561be2bf1d2ac13cf70bd4962880f3430fa649decb5922f216bc269d515
+size 796587
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24100_7d1876f706c61b09bf71.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24100_7d1876f706c61b09bf71.png
new file mode 100644
index 0000000000000000000000000000000000000000..1e7b701e34ce73073c718cc957be312a0d1494c9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24100_7d1876f706c61b09bf71.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7d1876f706c61b09bf71f05d6797e5a65455ca2b74d825ea6dc61d3b54ce3a81
+size 1062189
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24100_877d1785fe2221575f8c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24100_877d1785fe2221575f8c.png
new file mode 100644
index 0000000000000000000000000000000000000000..b058a3c461b5b59fa8502c08ffc504d455804acf
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24100_877d1785fe2221575f8c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:877d1785fe2221575f8c5b18f77e7b33819e6c510daa951382a8d101f91361c5
+size 428027
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24100_ed71fb216d4983031f69.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24100_ed71fb216d4983031f69.png
new file mode 100644
index 0000000000000000000000000000000000000000..2097c727dd4bdc28d9980cc43f291265fc73cadb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24100_ed71fb216d4983031f69.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ed71fb216d4983031f69fffacc0eb3c42730068a302b45ab513ecd4a348119c3
+size 1011715
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24300_6a5746243076b15ff537.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24300_6a5746243076b15ff537.png
new file mode 100644
index 0000000000000000000000000000000000000000..209cb1fb92436cf2219b79e1e77ad6b5e2629912
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24300_6a5746243076b15ff537.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6a5746243076b15ff5374bf427e0789d1250529c07ad7fde0513d3ff1693e2e0
+size 328589
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24300_8e76eb59063d1a43eb99.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24300_8e76eb59063d1a43eb99.png
new file mode 100644
index 0000000000000000000000000000000000000000..4ade04f996b0b5d801d3eacc47dd32ee1d9ca428
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24300_8e76eb59063d1a43eb99.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8e76eb59063d1a43eb998dd8d9d05dc6c9b28113f83e75e0cfe6f9cf78fd9c0d
+size 1047235
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24300_8f917b609a279e908c6b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24300_8f917b609a279e908c6b.png
new file mode 100644
index 0000000000000000000000000000000000000000..5b6da3103191a775cbe4266434b6a9ab46e76807
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24300_8f917b609a279e908c6b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8f917b609a279e908c6b5608ffaffec2e11be27a229d115a117cc4e7336d57aa
+size 615222
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24300_efef5877d7d71b171f6d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24300_efef5877d7d71b171f6d.png
new file mode 100644
index 0000000000000000000000000000000000000000..20865400359a1cd5bc723b33ba3b803aa026222c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24300_efef5877d7d71b171f6d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:efef5877d7d71b171f6dd48b93ab4a0ccd0c1c05fd6f25d26d9c90d721d628cb
+size 260948
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24500_26b94a4e06ff65228b21.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24500_26b94a4e06ff65228b21.png
new file mode 100644
index 0000000000000000000000000000000000000000..bab63359b6a63a2227f090cfe856bdc4c3a6a1a0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24500_26b94a4e06ff65228b21.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:26b94a4e06ff65228b215ee01159b99fbde8b825766d9d158ea6a108d384fe57
+size 264634
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24500_3c6df8a99de8e4e59f89.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24500_3c6df8a99de8e4e59f89.png
new file mode 100644
index 0000000000000000000000000000000000000000..763e272d6ad67a4157a7d30a4c46dd519037c003
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24500_3c6df8a99de8e4e59f89.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3c6df8a99de8e4e59f890f0cb8c1af41f3bbe11212664432b2621324fadd8030
+size 405211
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24500_4e239f8b973c92e1d1ab.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24500_4e239f8b973c92e1d1ab.png
new file mode 100644
index 0000000000000000000000000000000000000000..8ed03f0acfd703bed0f48e860270b7f6d3157cf3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24500_4e239f8b973c92e1d1ab.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4e239f8b973c92e1d1abfa8a0d55784d2826cfd50a248d069297c65ce656452a
+size 1413682
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24500_fc9bcbd958a24eb27aff.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24500_fc9bcbd958a24eb27aff.png
new file mode 100644
index 0000000000000000000000000000000000000000..b16d9e2a66e6bc1a492d97a5ff50d2a13372c869
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24500_fc9bcbd958a24eb27aff.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fc9bcbd958a24eb27aff81271d6a286eaf0d21af391cd882fb2f9c18ca8d5af3
+size 637322
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24700_29aa76a3d8904e43aec6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24700_29aa76a3d8904e43aec6.png
new file mode 100644
index 0000000000000000000000000000000000000000..2c0799bf1505f65923c8ef1f2a09e801b414086a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24700_29aa76a3d8904e43aec6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:29aa76a3d8904e43aec62c03be076a20265f6b07a71878f7f55c2c923a5f35f7
+size 817533
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24700_3bee0f4ee82bd166d217.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24700_3bee0f4ee82bd166d217.png
new file mode 100644
index 0000000000000000000000000000000000000000..c62cc93882985e01fb07410a7e4e04bf5257a4fa
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24700_3bee0f4ee82bd166d217.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3bee0f4ee82bd166d21727cb4979c32547b99e0625fb777159f3fa195286d5f7
+size 957337
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24700_9bb14012810a0183580d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24700_9bb14012810a0183580d.png
new file mode 100644
index 0000000000000000000000000000000000000000..8145c9047922913a42ddfbcf4924e4153fdc1abe
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24700_9bb14012810a0183580d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9bb14012810a0183580db2c16676427c12763090ff3be87efe7e434a2b12c527
+size 546212
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24700_ba60075af7eb455ca384.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24700_ba60075af7eb455ca384.png
new file mode 100644
index 0000000000000000000000000000000000000000..d96417d861dc33de17e6420a2d902505307baca1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24700_ba60075af7eb455ca384.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ba60075af7eb455ca38451d864bac74e3c4af11679ba87aefbc6a77af7721a00
+size 363748
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24900_19eb42e164fabeca6798.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24900_19eb42e164fabeca6798.png
new file mode 100644
index 0000000000000000000000000000000000000000..2344bf2da40cc3ac7a34b4b42647934b7553bfdb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24900_19eb42e164fabeca6798.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:19eb42e164fabeca6798a8cab66df4db0ee025fefe254ce15fdeef1040338ab5
+size 772419
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24900_2662a50f1f52fd10426a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24900_2662a50f1f52fd10426a.png
new file mode 100644
index 0000000000000000000000000000000000000000..85519ab0450a5a10238fc50f55e4c0c8581d86cd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24900_2662a50f1f52fd10426a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2662a50f1f52fd10426ad67d39d4638c1a58d49a69f9de92098cde00edbeb874
+size 910417
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24900_899e55d06746dac9171b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24900_899e55d06746dac9171b.png
new file mode 100644
index 0000000000000000000000000000000000000000..33dc3e76dcc22bf30f8561bba518ec8b2da15f3e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24900_899e55d06746dac9171b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:899e55d06746dac9171b33352247568c6afab0616b687c527ba2852a24360fa9
+size 523950
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24900_c6dad060efe4400f07d1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24900_c6dad060efe4400f07d1.png
new file mode 100644
index 0000000000000000000000000000000000000000..198d549511f5a9d3e021cb79d3d591dc6c356c5b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_24900_c6dad060efe4400f07d1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c6dad060efe4400f07d1a9fc69d3cf0905dde968b4eedfff66d88e1e1d700e6c
+size 452977
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2500_1fdfc014878b2a25e72d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2500_1fdfc014878b2a25e72d.png
new file mode 100644
index 0000000000000000000000000000000000000000..f58fd1d1327516107a96e1a77e376f2686076b3a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2500_1fdfc014878b2a25e72d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1fdfc014878b2a25e72d1866e892f42bb8955eeb313b52b8e50c1ba1d847ab91
+size 194409
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2500_3e438c6e02dab923ec72.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2500_3e438c6e02dab923ec72.png
new file mode 100644
index 0000000000000000000000000000000000000000..49288aa1f85e4795338b6e435ccf05b3fb1d3443
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2500_3e438c6e02dab923ec72.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3e438c6e02dab923ec723db4f94f271422450093efbce4c2a6b3bd3248d6296b
+size 100684
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2500_73025c3d4f7f29b88a26.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2500_73025c3d4f7f29b88a26.png
new file mode 100644
index 0000000000000000000000000000000000000000..05597270cdff49143c97980cc83a1da262c33db5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2500_73025c3d4f7f29b88a26.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:73025c3d4f7f29b88a26cd07192c0efb4dbd2e33e92e248d9409eecf8cffccf6
+size 156580
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2500_f36ba9d7c1d2ee6c9c17.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2500_f36ba9d7c1d2ee6c9c17.png
new file mode 100644
index 0000000000000000000000000000000000000000..c91aeaa2e8e1cfeb5b4a74dbc9e69bf1bbfd529a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2500_f36ba9d7c1d2ee6c9c17.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f36ba9d7c1d2ee6c9c178079a5b62bcbaa79cb53cb350262d8af660c1804f3b8
+size 189367
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25100_1719a890ec682d001663.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25100_1719a890ec682d001663.png
new file mode 100644
index 0000000000000000000000000000000000000000..09122b3e5274f0bec92ed1d46f8b5f5894abe99d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25100_1719a890ec682d001663.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1719a890ec682d001663255434d2800ec6bd11734108f20bfa91cc2c9a4fd298
+size 1173818
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25100_3b79f7f31747a360e310.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25100_3b79f7f31747a360e310.png
new file mode 100644
index 0000000000000000000000000000000000000000..acb0ff3d7cc473ebd93f3d61915759293c4f7534
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25100_3b79f7f31747a360e310.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3b79f7f31747a360e310d6ea95259eb12e6716d937c81fc478df124f346053ff
+size 528899
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25100_4ee5509ea94305e6fe3c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25100_4ee5509ea94305e6fe3c.png
new file mode 100644
index 0000000000000000000000000000000000000000..0ac596f2dd4e531ea195578c91d5cf076afdb9b2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25100_4ee5509ea94305e6fe3c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4ee5509ea94305e6fe3c431bb4f0613230678d51332a4279867582f6445be9e3
+size 1095072
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25100_c20a89e6205e9d1f5985.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25100_c20a89e6205e9d1f5985.png
new file mode 100644
index 0000000000000000000000000000000000000000..fd08eaa0b726122b54af4559a729369c27d4ca5a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25100_c20a89e6205e9d1f5985.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c20a89e6205e9d1f59853dbec65cb1eeae41e1ca5862294c9a2ae8560f5b10d9
+size 300635
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25300_398580bf2a437cbff7d9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25300_398580bf2a437cbff7d9.png
new file mode 100644
index 0000000000000000000000000000000000000000..998feef73cc6f8f41da5314ca47902fae4130d96
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25300_398580bf2a437cbff7d9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:398580bf2a437cbff7d933a3cd5a13a9e56256047a77492c9412513473d141a4
+size 660425
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25300_bfd262d222f4f574f40a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25300_bfd262d222f4f574f40a.png
new file mode 100644
index 0000000000000000000000000000000000000000..624aa2b748b113a68c0d1f0ed1cb10f1e6b7471d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25300_bfd262d222f4f574f40a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bfd262d222f4f574f40a7b30f4e3afce844816c8b00bae3b5e49c893eddfc3d3
+size 688063
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25300_c6d5a5b977c72846a948.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25300_c6d5a5b977c72846a948.png
new file mode 100644
index 0000000000000000000000000000000000000000..f1ab47adfbbc61c178221a729d3a9c9f770dd540
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25300_c6d5a5b977c72846a948.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c6d5a5b977c72846a948a47cacf8490c321a5d074f0f05cdc2abc805e4eedd15
+size 946572
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25300_eb533d43565088428ad3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25300_eb533d43565088428ad3.png
new file mode 100644
index 0000000000000000000000000000000000000000..c9b4698d152ccfb7caa8c4cf4984c00f8ad4a16c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25300_eb533d43565088428ad3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:eb533d43565088428ad3c1938f53c0f792e15cf3f00e0f6673cd22cb0d03f96c
+size 282602
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25500_522ea0b42f5f680fcb91.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25500_522ea0b42f5f680fcb91.png
new file mode 100644
index 0000000000000000000000000000000000000000..a8d0b0b221311bf6b2154d18476e68e6418b7f05
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25500_522ea0b42f5f680fcb91.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:522ea0b42f5f680fcb9160dccf2c215791a11e70bed9755739aa4addf3646a7e
+size 828997
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25500_852258477a5777c4de47.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25500_852258477a5777c4de47.png
new file mode 100644
index 0000000000000000000000000000000000000000..cd04907270460546219c40e85ee0af6f96363230
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25500_852258477a5777c4de47.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:852258477a5777c4de47b71d2fbc0964c858342e2e7e43b1820ac661151b4968
+size 1879115
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25500_a90b2e9ecf325e806e9f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25500_a90b2e9ecf325e806e9f.png
new file mode 100644
index 0000000000000000000000000000000000000000..114461d896b8c29bf1a7aaad689ddb464c1d6d0f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25500_a90b2e9ecf325e806e9f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a90b2e9ecf325e806e9fc9f07f2946e356a54be8e60c6559d17539dfaf2a2102
+size 608177
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25500_ca6273893161988f63f3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25500_ca6273893161988f63f3.png
new file mode 100644
index 0000000000000000000000000000000000000000..633506b779a9bda66d824ff70b6cd131aca8423a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25500_ca6273893161988f63f3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ca6273893161988f63f36a35de74fc71de9c38ff49f9c4e59897bcc7c32489e5
+size 1103318
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25700_18a152954f1889505def.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25700_18a152954f1889505def.png
new file mode 100644
index 0000000000000000000000000000000000000000..474a796ce8ea4e589635c3abf42a01bc0c021514
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25700_18a152954f1889505def.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:18a152954f1889505def5bdc227b2fdc4f8a1daf133030169255090247651383
+size 364963
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25700_b30bfc6898f657cc15e0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25700_b30bfc6898f657cc15e0.png
new file mode 100644
index 0000000000000000000000000000000000000000..df07950249de7bf9935dbab301cae6e84ed1e98a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25700_b30bfc6898f657cc15e0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b30bfc6898f657cc15e0e7b58728c8bd095ad0a8338c7afe6b87adf1d61459ea
+size 969761
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25700_c1f4758093981325aafa.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25700_c1f4758093981325aafa.png
new file mode 100644
index 0000000000000000000000000000000000000000..6b7e05e3f0fec77bfc74422c7d5a607d14fbc37e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25700_c1f4758093981325aafa.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c1f4758093981325aafa281d5486a3edb383ce5ed6411ecdd9557f2aa8d02394
+size 654207
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25700_f043ecf37bf36f896f26.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25700_f043ecf37bf36f896f26.png
new file mode 100644
index 0000000000000000000000000000000000000000..2d72e9e6822a99c2d4399dfc6c4b8bf215160841
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25700_f043ecf37bf36f896f26.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f043ecf37bf36f896f26ba7ceedffeb3c49b9f0204ce422a9b78c1e1c9842024
+size 418199
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25900_45c00d7e9d45830fcca1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25900_45c00d7e9d45830fcca1.png
new file mode 100644
index 0000000000000000000000000000000000000000..a47716847efa2364d1ea0cfcd3478ffacd906c9d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25900_45c00d7e9d45830fcca1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:45c00d7e9d45830fcca1bec22103d3eac40723b15dad76de4cca716b16baada5
+size 701596
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25900_cce83cb8cd0ba9a6b36c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25900_cce83cb8cd0ba9a6b36c.png
new file mode 100644
index 0000000000000000000000000000000000000000..ff64323bae16828a6dfd97b4468c513a3bd0cf52
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25900_cce83cb8cd0ba9a6b36c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cce83cb8cd0ba9a6b36c7f6bce449b83ec37ec4b0f937a581627344d74b50a9c
+size 852927
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25900_f1201b07722db0e7b0d4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25900_f1201b07722db0e7b0d4.png
new file mode 100644
index 0000000000000000000000000000000000000000..d3ff4c37d387fd9f49930c01a9731ab17c46e831
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25900_f1201b07722db0e7b0d4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f1201b07722db0e7b0d4a0d189b4bddc676c09bad96471754ed847bf0c270b2f
+size 997528
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25900_f9a5f5be3a5407776c1b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25900_f9a5f5be3a5407776c1b.png
new file mode 100644
index 0000000000000000000000000000000000000000..07def408e986f25db7dd8a08ceae1bd489958951
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_25900_f9a5f5be3a5407776c1b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f9a5f5be3a5407776c1b26043ac961ca934771160c078efb7f20a6a3e919f532
+size 1220767
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26100_44d68fcbddb0bcc336eb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26100_44d68fcbddb0bcc336eb.png
new file mode 100644
index 0000000000000000000000000000000000000000..0141ce9b53f30317bf66fc433c0b4e8772cbbc93
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26100_44d68fcbddb0bcc336eb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:44d68fcbddb0bcc336eb14271d13cb5ab38ad5eda876db0731160edc0e08a12a
+size 596016
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26100_47bce60e3f254a213bdb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26100_47bce60e3f254a213bdb.png
new file mode 100644
index 0000000000000000000000000000000000000000..0436b6dcb9e9cddf75e8551254c2f3b719511a1b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26100_47bce60e3f254a213bdb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:47bce60e3f254a213bdb956dbb531d00258b671e127ccd9e8acc746021850491
+size 653434
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26100_b3043f7840b8051566ad.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26100_b3043f7840b8051566ad.png
new file mode 100644
index 0000000000000000000000000000000000000000..8a4be642922ea533636f17fcb389e19e7a048ae8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26100_b3043f7840b8051566ad.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b3043f7840b8051566adacf364de47f5d9f7bb9cf100c8c91736346c95c196ce
+size 993778
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26100_f744a1802af048355b05.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26100_f744a1802af048355b05.png
new file mode 100644
index 0000000000000000000000000000000000000000..0deb20e3987f2781c2f73d77db998acd3a6362fe
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26100_f744a1802af048355b05.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f744a1802af048355b05fea90667e84b5f1b657313da2048ea1ac935208ffb26
+size 1423537
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26300_2d3670b5c455c4932e71.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26300_2d3670b5c455c4932e71.png
new file mode 100644
index 0000000000000000000000000000000000000000..1f12195132bb97b026a2491a5aed31fb6ed170ee
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26300_2d3670b5c455c4932e71.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d3670b5c455c4932e715f734ff50dec6215f85da6d37a1e192e0f5a5f7842cf
+size 1081883
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26300_4c4c37e5789a42954512.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26300_4c4c37e5789a42954512.png
new file mode 100644
index 0000000000000000000000000000000000000000..2f639673a23fe480e0ab53928752122021b28f98
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26300_4c4c37e5789a42954512.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4c4c37e5789a4295451230debdc6224f6eed34f1c193e314bcddfdae45728d61
+size 1097131
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26300_5cf2b63439c2b039e992.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26300_5cf2b63439c2b039e992.png
new file mode 100644
index 0000000000000000000000000000000000000000..2100a2645710f0a9a86de914bd27272aa7787258
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26300_5cf2b63439c2b039e992.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26300_5f5286b273aefb706279.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26300_5f5286b273aefb706279.png
new file mode 100644
index 0000000000000000000000000000000000000000..a242a2995b2e3ebe8db942a2eb8d8d89d0a9b19f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26300_5f5286b273aefb706279.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5f5286b273aefb7062791f3271af2282347a6ed41729cc5b48a64ce786e710d7
+size 632508
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26500_2eddd902c71c7db2e8a5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26500_2eddd902c71c7db2e8a5.png
new file mode 100644
index 0000000000000000000000000000000000000000..2ddf557467692e7de7a37d7061a422c963873cae
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26500_2eddd902c71c7db2e8a5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2eddd902c71c7db2e8a5fdc79c0aca21ac850872277d24bf4d3e60858dfca367
+size 809865
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26500_78e9c60d8d6bc939a318.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26500_78e9c60d8d6bc939a318.png
new file mode 100644
index 0000000000000000000000000000000000000000..8b64a23edb1c364fb43d05b7a87a40ef76af53ab
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26500_78e9c60d8d6bc939a318.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:78e9c60d8d6bc939a3180f88b57202a678d9e2620237022277c870738cfd10d0
+size 524995
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26500_8098f0b4bc8071a8f447.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26500_8098f0b4bc8071a8f447.png
new file mode 100644
index 0000000000000000000000000000000000000000..b70b8b520dfa65301a00f7b84c4675d25928a717
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26500_8098f0b4bc8071a8f447.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8098f0b4bc8071a8f447e98cbff860bf8daf50bf05d3c46d274e15ea2c8424a6
+size 248263
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26500_efbb31d261adcb12c049.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26500_efbb31d261adcb12c049.png
new file mode 100644
index 0000000000000000000000000000000000000000..6522de855e54054467f11f2bde034e1db71a216a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26500_efbb31d261adcb12c049.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:efbb31d261adcb12c0494bf1f2e433171abb1354fb61bc7b4f9565ff1cddb224
+size 993114
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26700_12f7774bea618b3e2f2d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26700_12f7774bea618b3e2f2d.png
new file mode 100644
index 0000000000000000000000000000000000000000..3180d6efd0c5e36da703ac44745fc582ccb3843c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26700_12f7774bea618b3e2f2d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:12f7774bea618b3e2f2d3ce0947f8a07f4f6f86c600cdde6ab81f17c81c1b7b5
+size 571618
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26700_543ca555cf38b5453c92.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26700_543ca555cf38b5453c92.png
new file mode 100644
index 0000000000000000000000000000000000000000..9f644c8756fa9d09a781dd9d4d196ec466dc95ae
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26700_543ca555cf38b5453c92.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:543ca555cf38b5453c9280cd1d209f314f7938db664d64090c3fe7e17bc441be
+size 768060
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26700_81d27f75f0be5a4cdae8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26700_81d27f75f0be5a4cdae8.png
new file mode 100644
index 0000000000000000000000000000000000000000..58999545b8b2d3527460bf4c4c72781e2fd5e50e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26700_81d27f75f0be5a4cdae8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:81d27f75f0be5a4cdae836ba1fd5d23c26ef8c6f4454dfebcc3e74d9db14e31e
+size 950039
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26700_86fb5066d3ebcbba4320.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26700_86fb5066d3ebcbba4320.png
new file mode 100644
index 0000000000000000000000000000000000000000..63ece6b477312c7ffa2c849f09c4e18274f07031
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26700_86fb5066d3ebcbba4320.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:86fb5066d3ebcbba432039fbdc8f3c436af52f1a1bd330f1e7057e23b475152a
+size 1214711
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26900_56a242b016678f381787.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26900_56a242b016678f381787.png
new file mode 100644
index 0000000000000000000000000000000000000000..1f9c455d30f87f9caef2b13e611ddb646765fa89
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26900_56a242b016678f381787.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:56a242b016678f381787cec87920ec753b9f09babef2aa093d2636bd93de11e7
+size 1008934
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26900_5f0143c341a73b1b25e1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26900_5f0143c341a73b1b25e1.png
new file mode 100644
index 0000000000000000000000000000000000000000..3ef0fdf202252504ddfb560b5151809e9d54daa6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26900_5f0143c341a73b1b25e1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5f0143c341a73b1b25e1ead3248cdeb37dec9766f29f9545665a57a40afdf866
+size 600729
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26900_7110e161e826002bf1da.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26900_7110e161e826002bf1da.png
new file mode 100644
index 0000000000000000000000000000000000000000..239dcc8231ac14cc22eb08bb0cc8afae1ab49ae6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26900_7110e161e826002bf1da.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7110e161e826002bf1da074504b21ee95ee6bc4ce76b52394a5c7ae2432c31a9
+size 644958
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26900_73f90b5c699318f8032c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26900_73f90b5c699318f8032c.png
new file mode 100644
index 0000000000000000000000000000000000000000..b3d9c2921aef5cc6a48c6f681f7f39d6637830e2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_26900_73f90b5c699318f8032c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:73f90b5c699318f8032c4045fb3ca522c48f62a594503cba2c86891c878cee69
+size 1223547
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2700_0b26dad94225d56f3df7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2700_0b26dad94225d56f3df7.png
new file mode 100644
index 0000000000000000000000000000000000000000..bb7980c672644945ae087850b76a3562435d7166
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2700_0b26dad94225d56f3df7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0b26dad94225d56f3df76182c94b95a7549bfc05bced62f863aea05de5075143
+size 339464
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2700_303b11f968ddf9168dc7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2700_303b11f968ddf9168dc7.png
new file mode 100644
index 0000000000000000000000000000000000000000..30cd6659e4e986353d57464bdea9d9c04ce14825
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2700_303b11f968ddf9168dc7.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2700_b664909a42c5e85985d4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2700_b664909a42c5e85985d4.png
new file mode 100644
index 0000000000000000000000000000000000000000..e54fa03c7322b82e8e68b2f10305779cfd54e59f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2700_b664909a42c5e85985d4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b664909a42c5e85985d4104c4446dfaed0fd7ab95d9c92b2f2062c0ad590a9e8
+size 936451
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2700_bd360fe99e464454000b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2700_bd360fe99e464454000b.png
new file mode 100644
index 0000000000000000000000000000000000000000..a93429e178205f314a717b135d928d1742e8c0bd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2700_bd360fe99e464454000b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bd360fe99e464454000b96542762e8cffb7931da641d97e6e62fe45a6f779960
+size 196727
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27100_383326195417cc6f21b9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27100_383326195417cc6f21b9.png
new file mode 100644
index 0000000000000000000000000000000000000000..8bd2d25fabc0b036c839af4c094172b2fd9fda42
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27100_383326195417cc6f21b9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:383326195417cc6f21b9d3f37d6f34053ea9dbe53c09205bd03129c5a66e8094
+size 248832
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27100_5ef1653a85f8605c8eb1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27100_5ef1653a85f8605c8eb1.png
new file mode 100644
index 0000000000000000000000000000000000000000..6a1daf5e149a32f62345ad40c8ee0ec9ab84a0fd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27100_5ef1653a85f8605c8eb1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5ef1653a85f8605c8eb15022f49aa2a0130ce14efb3234a58b6fc62c2eeeb3c4
+size 1142529
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27100_87fa8f2281d9ed3e2c7f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27100_87fa8f2281d9ed3e2c7f.png
new file mode 100644
index 0000000000000000000000000000000000000000..4284bef17b105ff269a16b8ae8261ce137afe1eb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27100_87fa8f2281d9ed3e2c7f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:87fa8f2281d9ed3e2c7fe2283b623ba376f43dc724646a9bd5d965bd593575fd
+size 281694
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27100_f25d25631ee0e6a2aa24.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27100_f25d25631ee0e6a2aa24.png
new file mode 100644
index 0000000000000000000000000000000000000000..c0dd198fc717b3ae9671523003ec2c21073e47a1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27100_f25d25631ee0e6a2aa24.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f25d25631ee0e6a2aa2421acba71b5f68b9659ec2ec497c457d6f2c0fd3439e9
+size 1053392
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27300_327ef47981cffcec7ca2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27300_327ef47981cffcec7ca2.png
new file mode 100644
index 0000000000000000000000000000000000000000..3eb7db5a91e1598a21aec42475505f614ee00787
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27300_327ef47981cffcec7ca2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:327ef47981cffcec7ca25d33cfb6f1f77bfa6956bd03a856e60bd8e4e690b9b0
+size 1015035
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27300_3cbe0bff01e6a9ea076b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27300_3cbe0bff01e6a9ea076b.png
new file mode 100644
index 0000000000000000000000000000000000000000..6e1c877277053ca15b0446d4a14c16798d526f8a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27300_3cbe0bff01e6a9ea076b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3cbe0bff01e6a9ea076b15a3e3a85dc17776c917c9e9140eddca605809c1333a
+size 922972
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27300_6460aeff31ebcf32fbc6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27300_6460aeff31ebcf32fbc6.png
new file mode 100644
index 0000000000000000000000000000000000000000..0ebfa6559728786775585ef46e01f6cdaffc8eae
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27300_6460aeff31ebcf32fbc6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6460aeff31ebcf32fbc6b8b0e94cbb29bbb2dc877f9655a4e9e011aaaaea9a52
+size 542940
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27300_8ba8bc45264e2633603d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27300_8ba8bc45264e2633603d.png
new file mode 100644
index 0000000000000000000000000000000000000000..679f3bdb9e73ffc8a467f5aaa51d79d22ef93819
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27300_8ba8bc45264e2633603d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8ba8bc45264e2633603d90c18f8149f8092ea2fa069d2dcd785cd4f90aa4d312
+size 614410
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27500_185c8fb895014b147dc1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27500_185c8fb895014b147dc1.png
new file mode 100644
index 0000000000000000000000000000000000000000..11e17fe3eb59fdfbc791a0bb15164c22370eb6fc
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27500_185c8fb895014b147dc1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:185c8fb895014b147dc1c29d74e540849ac1fe7c0cae4148d060bb8bfea22970
+size 981429
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27500_4e891fa8ae94bca95c2f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27500_4e891fa8ae94bca95c2f.png
new file mode 100644
index 0000000000000000000000000000000000000000..29877e4db3e712e7083ebe42038fb0fbb289fbda
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27500_4e891fa8ae94bca95c2f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4e891fa8ae94bca95c2f3c823066ba0faa850b695615503c0e4ac63ffd56cbba
+size 763448
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27500_57a9cc9d87ca1c858982.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27500_57a9cc9d87ca1c858982.png
new file mode 100644
index 0000000000000000000000000000000000000000..173a74e3f3d710eaabe9e6e5b6b5629283ec2599
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27500_57a9cc9d87ca1c858982.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:57a9cc9d87ca1c85898255c2f05909374fab8b934c8962361c31d98dc5626c39
+size 641481
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27500_98fa6aa08be970945a6b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27500_98fa6aa08be970945a6b.png
new file mode 100644
index 0000000000000000000000000000000000000000..9bc3d225e62fe98bbd411b1989af37fbb01d0551
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27500_98fa6aa08be970945a6b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:98fa6aa08be970945a6b5e1f5c209264a2edcae6c1180524b3b283ca7867c2b3
+size 1054144
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27700_8329dfe433afa2c1a387.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27700_8329dfe433afa2c1a387.png
new file mode 100644
index 0000000000000000000000000000000000000000..f3dd2bc80abea5aefd926b9e81e89e05904de4b8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27700_8329dfe433afa2c1a387.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8329dfe433afa2c1a387a20143fe61b8613744f5d484e1ec2917736620ff3bf0
+size 752402
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27700_9a8b545e66c9a1436b24.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27700_9a8b545e66c9a1436b24.png
new file mode 100644
index 0000000000000000000000000000000000000000..22179e524d5a7d0669b1d1907244b05e0d6ec4a1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27700_9a8b545e66c9a1436b24.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9a8b545e66c9a1436b24780cf60a0f2e347165fb54418bc35d67b67586f95806
+size 487711
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27700_a5276d45a3fa0f928bbe.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27700_a5276d45a3fa0f928bbe.png
new file mode 100644
index 0000000000000000000000000000000000000000..011c789c1fea1808e27b6c5c4ff62ea6b835378b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27700_a5276d45a3fa0f928bbe.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a5276d45a3fa0f928bbecb048828d06f63756d9877b07fe796b09421bbf54e5b
+size 1371235
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27700_bc480818fdffd16febcd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27700_bc480818fdffd16febcd.png
new file mode 100644
index 0000000000000000000000000000000000000000..ff7a10dd13145826a9672a10c632406869a04106
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27700_bc480818fdffd16febcd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bc480818fdffd16febcd3a4e1269a43b976091dd60cb59eed0f30da77bbce544
+size 1292818
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27900_18409ad5031262d950d9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27900_18409ad5031262d950d9.png
new file mode 100644
index 0000000000000000000000000000000000000000..d35c7e7e8aa614b509caa58d02c8c4e6192af79d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27900_18409ad5031262d950d9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:18409ad5031262d950d98126d901887e16b500a75114036e904a9a2b1162c0e3
+size 635500
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27900_222bea434a3b68a696f6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27900_222bea434a3b68a696f6.png
new file mode 100644
index 0000000000000000000000000000000000000000..2509566ba5f5813e78fc6e51b06fb0aec9a020cf
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27900_222bea434a3b68a696f6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:222bea434a3b68a696f61310e968f624231a49c5697ff07a605bf08ffd3cd8f5
+size 805830
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27900_3484c0f9897f72dcd14b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27900_3484c0f9897f72dcd14b.png
new file mode 100644
index 0000000000000000000000000000000000000000..e206a001e651beedfd94ced5704c5f315ddc7808
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27900_3484c0f9897f72dcd14b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3484c0f9897f72dcd14b589caec47157d704ad6daa9fdac46acbf47c29b50c45
+size 871587
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27900_6f3790a009bcadd05049.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27900_6f3790a009bcadd05049.png
new file mode 100644
index 0000000000000000000000000000000000000000..917c93ae9ac9aea9e8f8ed930de4eeacd3f16b48
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_27900_6f3790a009bcadd05049.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6f3790a009bcadd05049cad20940815309458683aece382c9d6a6e1aa8549635
+size 855550
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28100_3129fe2aad77f0ec5f3a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28100_3129fe2aad77f0ec5f3a.png
new file mode 100644
index 0000000000000000000000000000000000000000..a9108368da991ed3308c12e5c116322600660ef2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28100_3129fe2aad77f0ec5f3a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3129fe2aad77f0ec5f3a4aa4dc3e5d6043e7b77d86a8fae56b6c285a6ccf3538
+size 880508
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28100_5844cd1b3c818a9a2aee.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28100_5844cd1b3c818a9a2aee.png
new file mode 100644
index 0000000000000000000000000000000000000000..e4743c4a2622d75f5ad2a253752c02797fabc90f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28100_5844cd1b3c818a9a2aee.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5844cd1b3c818a9a2aee73ab6a74acdd227dce508ef048ca041dd0c35b591227
+size 1193296
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28100_7aa72bb71d9a4d1bd216.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28100_7aa72bb71d9a4d1bd216.png
new file mode 100644
index 0000000000000000000000000000000000000000..201674d3f16baf15a1ccdfacb1cd3607eb9799d7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28100_7aa72bb71d9a4d1bd216.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7aa72bb71d9a4d1bd216c5783489bfb98180699a971287c222ae463bde4582dd
+size 857948
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28100_e292b0361e4cfb4f27c4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28100_e292b0361e4cfb4f27c4.png
new file mode 100644
index 0000000000000000000000000000000000000000..42beea14ef09e1f1e38dfe6feece955246e89b87
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28100_e292b0361e4cfb4f27c4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e292b0361e4cfb4f27c4129de9499f0ba83ee8031ec9609b98ab2b48ce3ff867
+size 549331
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28300_44163dc9c5c58e4f41a3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28300_44163dc9c5c58e4f41a3.png
new file mode 100644
index 0000000000000000000000000000000000000000..840c79343096b540b61d030412238ace5c7daf15
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28300_44163dc9c5c58e4f41a3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:44163dc9c5c58e4f41a327629e23b09eb9a47c21b8aae4b3f2605b16e62d2564
+size 1067020
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28300_86bd32f61e345a53f170.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28300_86bd32f61e345a53f170.png
new file mode 100644
index 0000000000000000000000000000000000000000..ff491ea131141879cff09fa045568af43455e393
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28300_86bd32f61e345a53f170.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:86bd32f61e345a53f1704874cb4f93afa0b086e9efaefb9c18006c63bb4ecd1d
+size 811019
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28300_d2bf8dbd01d0f73bf210.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28300_d2bf8dbd01d0f73bf210.png
new file mode 100644
index 0000000000000000000000000000000000000000..f9ad5b274cc1a75f4e5e038711ce2d8b5be759bf
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28300_d2bf8dbd01d0f73bf210.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d2bf8dbd01d0f73bf2109b93b48669b5cad8a508ad9daad1085dfe66a0f69206
+size 548057
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28300_edd09528abde9fb3c310.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28300_edd09528abde9fb3c310.png
new file mode 100644
index 0000000000000000000000000000000000000000..dda795e3a02ab1435eaa4b6916f394aeaec2995c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28300_edd09528abde9fb3c310.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:edd09528abde9fb3c31083f88423438d7629076c9f8e82da5b5cb1efdcc050d8
+size 678487
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28500_316ed10a96e48b26add1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28500_316ed10a96e48b26add1.png
new file mode 100644
index 0000000000000000000000000000000000000000..85e28f1847d5febf94d690c06a83e1cb87dba348
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28500_316ed10a96e48b26add1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:316ed10a96e48b26add1bd4dada6d87feeeddecf251a4916feaf925e820bc00a
+size 1457374
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28500_7e9f6c2b183a428388d1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28500_7e9f6c2b183a428388d1.png
new file mode 100644
index 0000000000000000000000000000000000000000..bd443461ff64340a82ca5e31506be934a04770e0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28500_7e9f6c2b183a428388d1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7e9f6c2b183a428388d1545bd9e3d6447d6118c28215b8f07e20b137e2ce271e
+size 891962
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28500_f0781545f4340a2a768b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28500_f0781545f4340a2a768b.png
new file mode 100644
index 0000000000000000000000000000000000000000..798bfebbeee0e5fdcd7bc766cc4cefc0f135394c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28500_f0781545f4340a2a768b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f0781545f4340a2a768b632a06d1be154a3b8f2806fcc023b0b55f160b129e62
+size 451669
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28500_fd08f1596df2d2c1ef6b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28500_fd08f1596df2d2c1ef6b.png
new file mode 100644
index 0000000000000000000000000000000000000000..0ca2c7a8a8599af9c9c5d17b27016348d98b0632
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28500_fd08f1596df2d2c1ef6b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fd08f1596df2d2c1ef6b1dd82cd73cee55debf17560b3483be1305d66fe6468f
+size 475284
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28700_3f3f3e561442cd896df9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28700_3f3f3e561442cd896df9.png
new file mode 100644
index 0000000000000000000000000000000000000000..3fac2d436e975e66c86b7abc544c9074c19944b2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28700_3f3f3e561442cd896df9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3f3f3e561442cd896df9770bfedbf131653a410d9ca511449df744749d4ebbef
+size 477657
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28700_72a8e7edf4fb349224d8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28700_72a8e7edf4fb349224d8.png
new file mode 100644
index 0000000000000000000000000000000000000000..1a496a01e98c4c7a35ee07fb55aa10c254a247e4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28700_72a8e7edf4fb349224d8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:72a8e7edf4fb349224d8e0431e26aa4a079336d20081ac32d01746b18db0a04e
+size 575554
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28700_9058da86b431eb56fc12.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28700_9058da86b431eb56fc12.png
new file mode 100644
index 0000000000000000000000000000000000000000..46476a34d7ed944d715221d50b80070e48565255
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28700_9058da86b431eb56fc12.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9058da86b431eb56fc123cacd546b825aa9080f015a1dc7b8b756e6708af322b
+size 677271
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28700_9584d705db5f267cf853.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28700_9584d705db5f267cf853.png
new file mode 100644
index 0000000000000000000000000000000000000000..5732a99f866cd4ce0e74bac4d282be6b94cfe579
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28700_9584d705db5f267cf853.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9584d705db5f267cf853b07d2810856c2be39ba970666e1a289242cbbac4f906
+size 1805871
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28900_0fe8970fc739339e919d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28900_0fe8970fc739339e919d.png
new file mode 100644
index 0000000000000000000000000000000000000000..ea739928e4d1723d5e00b39ed2584302d7513310
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28900_0fe8970fc739339e919d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0fe8970fc739339e919d32806d9f79a7f3469fa50249df68cff5d838af470088
+size 1095950
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28900_3bb2bf40fc9271eab41e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28900_3bb2bf40fc9271eab41e.png
new file mode 100644
index 0000000000000000000000000000000000000000..1dfeabd2149a304f8a73572425e093861d4ccc0d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28900_3bb2bf40fc9271eab41e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3bb2bf40fc9271eab41ede2912b931e5ecc0b7388ab1a9d72df44f3f4d086f79
+size 699009
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28900_3f920aa35e0cea51fb9a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28900_3f920aa35e0cea51fb9a.png
new file mode 100644
index 0000000000000000000000000000000000000000..03780e01211e318b115911f49c2e3eeff5080e3d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28900_3f920aa35e0cea51fb9a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3f920aa35e0cea51fb9af998396327a68a771a170c9d921d46d6d4ff41508e16
+size 309845
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28900_c001f00a21afcca54ac7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28900_c001f00a21afcca54ac7.png
new file mode 100644
index 0000000000000000000000000000000000000000..f1f070a848c5b600a442981eb7b6eec2b962fb2d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_28900_c001f00a21afcca54ac7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c001f00a21afcca54ac70fb72eb01a6524371998d0a383049b153404ad7ac6ce
+size 1385522
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2900_315f8fdc5c71d566b66d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2900_315f8fdc5c71d566b66d.png
new file mode 100644
index 0000000000000000000000000000000000000000..2d0762b1b9dfac874529f1ea47642c9ed1b96213
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2900_315f8fdc5c71d566b66d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:315f8fdc5c71d566b66d168da10a0d75c5a336d368efb053df52fdee17f5e624
+size 852912
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2900_743a8eaa56d7174ae20c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2900_743a8eaa56d7174ae20c.png
new file mode 100644
index 0000000000000000000000000000000000000000..9254fbab9dbc24de2477560f4e7131a43514e595
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2900_743a8eaa56d7174ae20c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:743a8eaa56d7174ae20c3f64d581d223ed485147d3b3fe429693dc7fa7fe9416
+size 505587
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2900_cceaf6186ba7b4bf9f00.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2900_cceaf6186ba7b4bf9f00.png
new file mode 100644
index 0000000000000000000000000000000000000000..1a0850be9a429b9d9611f03aeedf3ab7816d351c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2900_cceaf6186ba7b4bf9f00.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cceaf6186ba7b4bf9f0037f01af2706e15661ffbc6fb0a278ac58dc1d5260701
+size 992306
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2900_e04d3b88a5b6d0b90da0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2900_e04d3b88a5b6d0b90da0.png
new file mode 100644
index 0000000000000000000000000000000000000000..7af61c69f1efd6a21eadedd7216127a911c81c90
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_2900_e04d3b88a5b6d0b90da0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e04d3b88a5b6d0b90da0cf14870e836465bfe2dab110ffc3d2afeeebcf91c196
+size 713039
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29100_619a557e07c30377fe0d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29100_619a557e07c30377fe0d.png
new file mode 100644
index 0000000000000000000000000000000000000000..ba9d192a63673d5e77360768c746e5afa920bddc
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29100_619a557e07c30377fe0d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:619a557e07c30377fe0dc7ca074eae17386b482a4d2cf9149a684dc96fcdd703
+size 914337
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29100_7667a87ce7b971ba4f15.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29100_7667a87ce7b971ba4f15.png
new file mode 100644
index 0000000000000000000000000000000000000000..865e1541475bb1fa9c359d8d6ecc4f2e9bf56fc9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29100_7667a87ce7b971ba4f15.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7667a87ce7b971ba4f15b717751775442a885fcde47498f352286565a4e80429
+size 664222
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29100_9517341185f6af8d84ce.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29100_9517341185f6af8d84ce.png
new file mode 100644
index 0000000000000000000000000000000000000000..13bf57635292d54f2dafc188dfa18a23560a8c41
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29100_9517341185f6af8d84ce.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9517341185f6af8d84cef8d31746b04ee86c1656214e3e0bd4ca6971ae0d87ce
+size 422451
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29100_a8e21cf61f35b05e957e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29100_a8e21cf61f35b05e957e.png
new file mode 100644
index 0000000000000000000000000000000000000000..93fb9186d453142cfa31691b3de937a509c75afb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29100_a8e21cf61f35b05e957e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a8e21cf61f35b05e957e439cc6cccfac16a87b63dc2fde5c8fa6c935bd9bddc1
+size 967027
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29300_0a0d06b7b24fa3cea9f2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29300_0a0d06b7b24fa3cea9f2.png
new file mode 100644
index 0000000000000000000000000000000000000000..37115981f9e26f6a34352596ef2a0357fd2d253c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29300_0a0d06b7b24fa3cea9f2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0a0d06b7b24fa3cea9f28bb4a19a00c0985868ba7529d3441ed58465778c47d0
+size 572238
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29300_0d2a21de3aa1320da4fa.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29300_0d2a21de3aa1320da4fa.png
new file mode 100644
index 0000000000000000000000000000000000000000..05cb6b336ac32855b6dfcf94911dd12967e692ed
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29300_0d2a21de3aa1320da4fa.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0d2a21de3aa1320da4fade392f17dea21d99705ad22e8814b9b6792ebf5c249c
+size 1193834
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29300_2e27248b02355089cfd5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29300_2e27248b02355089cfd5.png
new file mode 100644
index 0000000000000000000000000000000000000000..67c393c89db590f0af076087803c28bfbcb1a0a0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29300_2e27248b02355089cfd5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2e27248b02355089cfd599207fd082ad56e2c26c16993c98da9a712e38081943
+size 519313
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29300_9c98e8ae1919b770bb70.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29300_9c98e8ae1919b770bb70.png
new file mode 100644
index 0000000000000000000000000000000000000000..7519f4bc65e9a1f4b31cb0a57425be25dcde08d2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29300_9c98e8ae1919b770bb70.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9c98e8ae1919b770bb700ae73790eb71271df0db6879a44296327f20e682a175
+size 1218215
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29500_321de29b05db436d699c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29500_321de29b05db436d699c.png
new file mode 100644
index 0000000000000000000000000000000000000000..d660357eed90adcd700c21968368079b3be9e34c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29500_321de29b05db436d699c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:321de29b05db436d699c831f7ba7968030dc3e53c961b07de42bce4139b9b78a
+size 1871897
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29500_49446d4229567ef8b0cb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29500_49446d4229567ef8b0cb.png
new file mode 100644
index 0000000000000000000000000000000000000000..c5512f7924bff4cd82394d96419863ed5465440d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29500_49446d4229567ef8b0cb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:49446d4229567ef8b0cb57d48f99ab0b981b3be09e042cdbf2717a4c2cefd1b9
+size 687404
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29500_c0d51933a311ba0329ad.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29500_c0d51933a311ba0329ad.png
new file mode 100644
index 0000000000000000000000000000000000000000..ffa1fe484749d25075bb6da9c52141bca3638e89
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29500_c0d51933a311ba0329ad.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c0d51933a311ba0329ad486d1cc8b10212357c1f54a64c4b4b41e47362079ab2
+size 876054
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29500_d05e1840d49f66248eb3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29500_d05e1840d49f66248eb3.png
new file mode 100644
index 0000000000000000000000000000000000000000..e2c47a7861034f80563453059de2b688ea50fd3d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29500_d05e1840d49f66248eb3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d05e1840d49f66248eb3af9b65812c17db4856b72f9fdb79b2bc7d7d4130a772
+size 640078
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29700_352c6179fb99f90d563d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29700_352c6179fb99f90d563d.png
new file mode 100644
index 0000000000000000000000000000000000000000..a5ac067d821eea264a34c07ccbc88b3a13f99893
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29700_352c6179fb99f90d563d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:352c6179fb99f90d563d338fa5a08f23ee2f479cb44e8613f60bf0ada6f86ac1
+size 626748
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29700_90f6a2438f6501f74263.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29700_90f6a2438f6501f74263.png
new file mode 100644
index 0000000000000000000000000000000000000000..7395a2d03c5ba8db725dbcd09e8ab94f35c94c44
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29700_90f6a2438f6501f74263.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:90f6a2438f6501f7426357a46ce5a3d1a08d895a4192153a41d41a09570533ce
+size 631514
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29700_b80489e29bae9db580bf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29700_b80489e29bae9db580bf.png
new file mode 100644
index 0000000000000000000000000000000000000000..26cdd6fb243badcd133f75b2097c72451aa4806c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29700_b80489e29bae9db580bf.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b80489e29bae9db580bf00c3022437142de415316c9cdbbcf8f09de9f1a5e3fe
+size 1358770
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29700_ef09a4afa234a25c9bd3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29700_ef09a4afa234a25c9bd3.png
new file mode 100644
index 0000000000000000000000000000000000000000..15366c6fe7350223a5884cffedc8b4cf12725e70
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29700_ef09a4afa234a25c9bd3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ef09a4afa234a25c9bd3a9886183bfe0677cc731744267347c50da1592494c53
+size 701216
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29900_0bac5c1794a35398e5f2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29900_0bac5c1794a35398e5f2.png
new file mode 100644
index 0000000000000000000000000000000000000000..c42b18612e29db29c0bbdc629be2d45ddf707ef3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29900_0bac5c1794a35398e5f2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0bac5c1794a35398e5f2f2de614c5277280419d32f8cf9da91ab3e92cd02141c
+size 685955
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29900_0ec382ea198e1d87270a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29900_0ec382ea198e1d87270a.png
new file mode 100644
index 0000000000000000000000000000000000000000..53476de13deaceabfc990fefb1652b7d82228a68
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29900_0ec382ea198e1d87270a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0ec382ea198e1d87270a58db3b68a5e914e866503c00e2216035a45bf46b2674
+size 636253
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29900_1d0801aa1b040b9ebcc1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29900_1d0801aa1b040b9ebcc1.png
new file mode 100644
index 0000000000000000000000000000000000000000..d06c48020d7e8d09b1cdae55090cec4648c47689
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29900_1d0801aa1b040b9ebcc1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1d0801aa1b040b9ebcc1bddecc119368246b84e033813d12ee45845b17ec63b0
+size 1104429
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29900_27586d01b4f0c00d1a23.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29900_27586d01b4f0c00d1a23.png
new file mode 100644
index 0000000000000000000000000000000000000000..06ec7f9fa45394365eb9d6e9235fdd4a2aafdcfe
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_29900_27586d01b4f0c00d1a23.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:27586d01b4f0c00d1a235fa02c8e602cd0095e02ec6d9b2a7040a916916c3738
+size 530986
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30100_1225aaf52beac907ec9d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30100_1225aaf52beac907ec9d.png
new file mode 100644
index 0000000000000000000000000000000000000000..7ba6a5d78b9db905504abe730f16eafa6bc947cd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30100_1225aaf52beac907ec9d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1225aaf52beac907ec9de4a7bea39495685cafa5b8de791cfedc889563c195ec
+size 515416
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30100_4177b32f2a60f91beee7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30100_4177b32f2a60f91beee7.png
new file mode 100644
index 0000000000000000000000000000000000000000..aa1491039ccd222e4887e1ba7f63875bf6e879a4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30100_4177b32f2a60f91beee7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4177b32f2a60f91beee7459ff44d06ad7b49a5e425aa868e5a4193b938bdb2ad
+size 949147
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30100_ad6989e54d92d4d5e617.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30100_ad6989e54d92d4d5e617.png
new file mode 100644
index 0000000000000000000000000000000000000000..a64bd2839ff3f1e91cf6fa4b27247d5cd7264f48
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30100_ad6989e54d92d4d5e617.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ad6989e54d92d4d5e61788cd0aed364a0b2577aaa4d0d247fd66da48c03fe012
+size 911819
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30100_ff6c084e8ad288eaa100.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30100_ff6c084e8ad288eaa100.png
new file mode 100644
index 0000000000000000000000000000000000000000..30c4b641bfb6cefa1fe9a739140f2dba2935495d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30100_ff6c084e8ad288eaa100.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ff6c084e8ad288eaa1007dbcc3fe44ee6232ec1aa878ddceed89893b62788d6b
+size 599657
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30300_35729245891160bcb195.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30300_35729245891160bcb195.png
new file mode 100644
index 0000000000000000000000000000000000000000..c747e45d02594263962ff5beeb375c547d06f236
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30300_35729245891160bcb195.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:35729245891160bcb19509bac7be8b1dc1a1ba2799b9eb4f43eb85b97c2e72e1
+size 252973
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30300_9e146c20defefa78e942.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30300_9e146c20defefa78e942.png
new file mode 100644
index 0000000000000000000000000000000000000000..6f6d357122d5988b5612031d10b992efa3cf578e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30300_9e146c20defefa78e942.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9e146c20defefa78e942ad6d3950bc983ab33119814581e5821e8e39a62880f7
+size 934613
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30300_c942fc8bd3ad1c6ab238.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30300_c942fc8bd3ad1c6ab238.png
new file mode 100644
index 0000000000000000000000000000000000000000..9e3a62dfb7c70c4d47a2978d4778a38166029b7f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30300_c942fc8bd3ad1c6ab238.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c942fc8bd3ad1c6ab2384f65bd23b5952873cad341ad557b9eae185a7e9e0396
+size 487616
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30300_e38a3da8c2e80661a0e8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30300_e38a3da8c2e80661a0e8.png
new file mode 100644
index 0000000000000000000000000000000000000000..d201f480181d1f14e6a9c8df9b0d9c1334525eae
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30300_e38a3da8c2e80661a0e8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e38a3da8c2e80661a0e8ee20e007692d9ca3798928e47ab127b984a37a84d325
+size 362388
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30500_ac632ac71485aaa791d4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30500_ac632ac71485aaa791d4.png
new file mode 100644
index 0000000000000000000000000000000000000000..ec76ba17edce9c840e6a596afe4b30526beda813
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30500_ac632ac71485aaa791d4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ac632ac71485aaa791d45433cb3456030a4a19ee25d6d74038bf66db6eb5732c
+size 483177
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30500_b8292001495601b553d7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30500_b8292001495601b553d7.png
new file mode 100644
index 0000000000000000000000000000000000000000..0d609499514b61c0aa68ba5097ab8fea1b20cf6d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30500_b8292001495601b553d7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b8292001495601b553d791797fc32e0f4a3c49dee20848bb3bcf804d56be2ee7
+size 765894
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30500_b9386cb4a65072455997.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30500_b9386cb4a65072455997.png
new file mode 100644
index 0000000000000000000000000000000000000000..64f9664a0627db23b689971398b5d38fc217b762
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30500_b9386cb4a65072455997.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b9386cb4a65072455997d49f4c6ae20ee2d13c10038dc0832eabf52d50997af4
+size 398106
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30500_c8c04d48fdd8e97dc40a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30500_c8c04d48fdd8e97dc40a.png
new file mode 100644
index 0000000000000000000000000000000000000000..6a467471414aac1e7d1bb8df0ed2010c95be69f5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30500_c8c04d48fdd8e97dc40a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c8c04d48fdd8e97dc40a89058e0f3fd0be3851c6b93f0db7ac240e81b35ab677
+size 769799
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30700_078bc900ec2b580d1b6b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30700_078bc900ec2b580d1b6b.png
new file mode 100644
index 0000000000000000000000000000000000000000..3a25ecd1251bbc4e6a5f4bf6f6c4991a15713a44
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30700_078bc900ec2b580d1b6b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:078bc900ec2b580d1b6b8b31ea20362d9b3bee6068def61d203c36c1c9b978ec
+size 609422
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30700_8c7e415db48999743962.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30700_8c7e415db48999743962.png
new file mode 100644
index 0000000000000000000000000000000000000000..59273554cef8ff3ab61a774196461977b5ae9e32
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30700_8c7e415db48999743962.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8c7e415db489997439629bcf2010c3c17b98bf9c3564feec05c03fb05faa6522
+size 1249038
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30700_94aca63f514a54fc9edf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30700_94aca63f514a54fc9edf.png
new file mode 100644
index 0000000000000000000000000000000000000000..637c11d446e318d81d69c4c97554969f0ef6a2c5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30700_94aca63f514a54fc9edf.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:94aca63f514a54fc9edfee952ddf5a2c6e6991a9ee195ae997028a70bba7842f
+size 954022
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30700_c48c10357d226f6e202f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30700_c48c10357d226f6e202f.png
new file mode 100644
index 0000000000000000000000000000000000000000..efe4562b08cbeb5a7ba0c0bff2d027c1b454c3a5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30700_c48c10357d226f6e202f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c48c10357d226f6e202f5e178f36deee8fb417b59a5e7c77e638fe8599d3fcc0
+size 640671
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30900_0134585fe579d63cc4c7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30900_0134585fe579d63cc4c7.png
new file mode 100644
index 0000000000000000000000000000000000000000..071575933c87d4a1b1fd59367e278e8746855ff9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30900_0134585fe579d63cc4c7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0134585fe579d63cc4c777a8c6797c3df18e498510bd9ed26598102a394775f9
+size 1070973
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30900_4a5b7b3140b8c66af2e7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30900_4a5b7b3140b8c66af2e7.png
new file mode 100644
index 0000000000000000000000000000000000000000..6385df643687ec38fe93a7d4af3fd552e0bd9efc
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30900_4a5b7b3140b8c66af2e7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4a5b7b3140b8c66af2e711a3b6a757459ce062d1de27fa2b8a65e03711eb3656
+size 542902
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30900_5e9aca236bf81cc16c7f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30900_5e9aca236bf81cc16c7f.png
new file mode 100644
index 0000000000000000000000000000000000000000..71fe59e817c9a1f16cd1c100cf9c5e554406264b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30900_5e9aca236bf81cc16c7f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5e9aca236bf81cc16c7f241347ef95e0e27be3431095c7847450480110111efe
+size 1201358
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30900_d495292a48eefdf19ff6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30900_d495292a48eefdf19ff6.png
new file mode 100644
index 0000000000000000000000000000000000000000..7282243687dc1e88dc96dba29ef0fb2b1b10adb1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_30900_d495292a48eefdf19ff6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d495292a48eefdf19ff647a18340607d9dd582b88f3ec574b4b67825bf403e34
+size 445107
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3100_282387d324567af8d807.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3100_282387d324567af8d807.png
new file mode 100644
index 0000000000000000000000000000000000000000..1d75e6dfe08c8d95d377007fa803ac64ced40da9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3100_282387d324567af8d807.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:282387d324567af8d807446b0270cdfb5cbd9b0c40357c7e7f2fe68bcce08143
+size 841786
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3100_6e3c4a10f927ca597322.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3100_6e3c4a10f927ca597322.png
new file mode 100644
index 0000000000000000000000000000000000000000..520e9e16600d44d0430f08ba3da853390fde8726
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3100_6e3c4a10f927ca597322.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e3c4a10f927ca5973225f0356cb717f7a6b84458f1bb38ff16a9cf01c5bd23a
+size 947233
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3100_95cbfc257b851d12b010.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3100_95cbfc257b851d12b010.png
new file mode 100644
index 0000000000000000000000000000000000000000..dd4de2b4722d0f6092ece8a43189f0b8a9f5c240
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3100_95cbfc257b851d12b010.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:95cbfc257b851d12b010a08201335c23367d082a63dd49f37c8db1baca0f489e
+size 762716
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3100_c2b061555574dec84601.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3100_c2b061555574dec84601.png
new file mode 100644
index 0000000000000000000000000000000000000000..12aa2d561f9f9592515878323e1bc8f793aa1938
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3100_c2b061555574dec84601.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c2b061555574dec8460163eb2ad4407d1b7dec70a485df77f2c7b9b5a2a5ba5b
+size 740336
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31100_0104b2f6fee02189c204.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31100_0104b2f6fee02189c204.png
new file mode 100644
index 0000000000000000000000000000000000000000..c8e3eb0bd7f97b28a388af86f0b32b03dce5ad8e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31100_0104b2f6fee02189c204.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0104b2f6fee02189c204af0514fe63dd486ca41498a25bc8e55f011d707bd592
+size 835961
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31100_073bc83c6743b6aa0415.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31100_073bc83c6743b6aa0415.png
new file mode 100644
index 0000000000000000000000000000000000000000..999889cbbabb703459baa04551a227074ea9db20
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31100_073bc83c6743b6aa0415.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:073bc83c6743b6aa0415432a8943614a2c3fd33470a66c9baa456033ed55c71a
+size 530408
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31100_3dc37e2381b06ed65d15.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31100_3dc37e2381b06ed65d15.png
new file mode 100644
index 0000000000000000000000000000000000000000..80180e0d396bfe5b2ad5ebc7448f6841ce1bee73
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31100_3dc37e2381b06ed65d15.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3dc37e2381b06ed65d155e0f93584b191b4aa18de6a852a115baa406972f26e1
+size 531255
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31100_6ae454a77a5c59b438d0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31100_6ae454a77a5c59b438d0.png
new file mode 100644
index 0000000000000000000000000000000000000000..a68dea4f573c72a957ea82e8be5ebdd170de426d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31100_6ae454a77a5c59b438d0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6ae454a77a5c59b438d095dc8aaadc84d087aeeda8768fd95b98b0df2f0df3a9
+size 998759
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31300_524daee32005143494dc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31300_524daee32005143494dc.png
new file mode 100644
index 0000000000000000000000000000000000000000..d4c64eb94b2a6649684e45eb1a114e07a5c5ccb4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31300_524daee32005143494dc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:524daee32005143494dc34943ae872204ab342722efd539cd60b4663f75b4526
+size 618459
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31300_582cad98839a9e095e70.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31300_582cad98839a9e095e70.png
new file mode 100644
index 0000000000000000000000000000000000000000..df50e54a4e895dd49cabd8b2c4e36df986740a04
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31300_582cad98839a9e095e70.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:582cad98839a9e095e700dca3ec0825f2318faf536efae153965aae0aef07568
+size 926244
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31300_83871d3b88f7f259e03d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31300_83871d3b88f7f259e03d.png
new file mode 100644
index 0000000000000000000000000000000000000000..885a2248b8e6ed6bc42a9ae0dcae63ac8d6bacf1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31300_83871d3b88f7f259e03d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:83871d3b88f7f259e03d3c0d22d62a4bf0029ce9e96b083956e055b465c48876
+size 1103725
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31300_b30d557eb2d0bb1b8368.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31300_b30d557eb2d0bb1b8368.png
new file mode 100644
index 0000000000000000000000000000000000000000..8b9cfdeb9c1c20704c74a4b887905b474b95c3bb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31300_b30d557eb2d0bb1b8368.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b30d557eb2d0bb1b8368364d1597c3ce7a176152616656ec08a10a53251f2dca
+size 996936
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31500_048434fb665ab6feac2b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31500_048434fb665ab6feac2b.png
new file mode 100644
index 0000000000000000000000000000000000000000..a98e770c2899b65414f9c05b6e4f6bfaf97be776
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31500_048434fb665ab6feac2b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:048434fb665ab6feac2b3ebe25f73f1804e23f22b2ad078fbf8faf5fd2cffda0
+size 548065
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31500_20358c7f3cab01e088fa.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31500_20358c7f3cab01e088fa.png
new file mode 100644
index 0000000000000000000000000000000000000000..d8449a5f84d562c996984380a1da1b0e7f2cbb13
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31500_20358c7f3cab01e088fa.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:20358c7f3cab01e088fade211c6cd9d4d4029a9643ccf93e9c59dda7ef4e5dde
+size 515823
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31500_43d8f0cccadcd97c6747.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31500_43d8f0cccadcd97c6747.png
new file mode 100644
index 0000000000000000000000000000000000000000..a100b0555adecb0aa61beabddb54fb3ab3f39f07
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31500_43d8f0cccadcd97c6747.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:43d8f0cccadcd97c6747037b6919b18330bc6bc746425d193c24c30683a08ca3
+size 707499
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31500_b688fc9eb0f7774c7465.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31500_b688fc9eb0f7774c7465.png
new file mode 100644
index 0000000000000000000000000000000000000000..f1e61e36c31f6418b761a1e95e8f02306e80e6e6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31500_b688fc9eb0f7774c7465.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b688fc9eb0f7774c7465dde2bf13921524fab7cf87c42b48a8eed96d410a5b5f
+size 592305
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31700_285dff779b53cc89ff24.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31700_285dff779b53cc89ff24.png
new file mode 100644
index 0000000000000000000000000000000000000000..c659991a5ccee85eae4ab6b33fdec6f7cf85e3a4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31700_285dff779b53cc89ff24.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:285dff779b53cc89ff2433d9d66240e708c1aef9a3c3914626b3c1120119fbf9
+size 300607
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31700_4d79ffa1fd7cab05b6cb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31700_4d79ffa1fd7cab05b6cb.png
new file mode 100644
index 0000000000000000000000000000000000000000..759d41bc8c2c981bc86e983b32193d4114196dd4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31700_4d79ffa1fd7cab05b6cb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4d79ffa1fd7cab05b6cba4ea659cbcbdc054ecb3beadb26e11fe3596fc421419
+size 654402
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31700_5f5b9faac84f1a1d59bd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31700_5f5b9faac84f1a1d59bd.png
new file mode 100644
index 0000000000000000000000000000000000000000..c89cf73644e774dad03a175beef828e0cbf3465c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31700_5f5b9faac84f1a1d59bd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5f5b9faac84f1a1d59bd2680b8cab0fa3ad4b1ab0c2d515d5894a20c468cbc38
+size 1208795
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31700_a9ad0e19e0d98fdc8de1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31700_a9ad0e19e0d98fdc8de1.png
new file mode 100644
index 0000000000000000000000000000000000000000..8b21d342e669eaff12581a7991ee12d500ed2728
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31700_a9ad0e19e0d98fdc8de1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a9ad0e19e0d98fdc8de1da2f66b3c72eee6910ce9158bb8ba5a7d5af2264d6e1
+size 761244
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31900_50bf17c58ae7a899a3a7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31900_50bf17c58ae7a899a3a7.png
new file mode 100644
index 0000000000000000000000000000000000000000..6661063014e7f0dc2e689a2e48008b4657f40e8c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31900_50bf17c58ae7a899a3a7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:50bf17c58ae7a899a3a7d229e0f9e2541b68e676a57e7ff7bad8cde6ab9ab61b
+size 596655
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31900_5f2f17d45978d648ce56.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31900_5f2f17d45978d648ce56.png
new file mode 100644
index 0000000000000000000000000000000000000000..dac95f186ba367853f19a12b3a7e365f774c4b03
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31900_5f2f17d45978d648ce56.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5f2f17d45978d648ce566ef2d9bf1d28438e36521abaf33d871a8540823c8fb3
+size 1045110
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31900_75632ab188ea4e9927e7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31900_75632ab188ea4e9927e7.png
new file mode 100644
index 0000000000000000000000000000000000000000..ea3914ac6781f9a2497fef5b67b43c7892eec7e8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31900_75632ab188ea4e9927e7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:75632ab188ea4e9927e7eedda2a71bacfdc38ffad3aca149a16a7c2a7bae3dc4
+size 667203
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31900_cde313c724212d8a7c97.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31900_cde313c724212d8a7c97.png
new file mode 100644
index 0000000000000000000000000000000000000000..1d98cb04c5776a1b1a61145e1b41c33f6a9a5fc0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_31900_cde313c724212d8a7c97.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cde313c724212d8a7c97d29979ec9fc4f4000bcc769436a8bccb3289b4b461e1
+size 494039
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32100_042747563ed3d9b386bb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32100_042747563ed3d9b386bb.png
new file mode 100644
index 0000000000000000000000000000000000000000..33594cd7dfda42eed4e0082992b6f818d69059a3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32100_042747563ed3d9b386bb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:042747563ed3d9b386bbf2cafb206bd42b0123ba3d278a6dffa17cf401df7100
+size 568692
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32100_1b99500d80cb10708faf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32100_1b99500d80cb10708faf.png
new file mode 100644
index 0000000000000000000000000000000000000000..f5272a8f2c537a7f77c6f8f76fa25cb5f4ac6b77
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32100_1b99500d80cb10708faf.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1b99500d80cb10708faf4d817e1d5dad18b7bd5dc61fe03af445795463e6967b
+size 1725380
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32100_a4ccca276e8ae240b191.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32100_a4ccca276e8ae240b191.png
new file mode 100644
index 0000000000000000000000000000000000000000..5dc0162f925069adb26a813f0182a71b773433ca
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32100_a4ccca276e8ae240b191.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a4ccca276e8ae240b191900620c498cd6624b82921b64f49602b627d9c8926d7
+size 688038
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32100_fd6c6c25fa0048dd973d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32100_fd6c6c25fa0048dd973d.png
new file mode 100644
index 0000000000000000000000000000000000000000..e960ba17e8e24a7b222d6b76f4861f2e0f0a58d9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32100_fd6c6c25fa0048dd973d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fd6c6c25fa0048dd973d95d0f0a8649fbc376907c77d50ea1292e307b6b32e86
+size 551227
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32300_82fad541ee4e516827d5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32300_82fad541ee4e516827d5.png
new file mode 100644
index 0000000000000000000000000000000000000000..8fb8d639fbccc36d831b41ea41e9d607c28e2579
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32300_82fad541ee4e516827d5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:82fad541ee4e516827d56b2153e1e8d9eab9807ae7490ae9d94b154270e8b625
+size 743505
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32300_9ea49328477544575381.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32300_9ea49328477544575381.png
new file mode 100644
index 0000000000000000000000000000000000000000..743ed3b25205418b069ff2237a4cd486e5390a3b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32300_9ea49328477544575381.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9ea493284775445753816d0feee86bda44bb1ce9bc78e9e84d7f17d950f22a40
+size 825160
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32300_9ed90aa17207df349d17.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32300_9ed90aa17207df349d17.png
new file mode 100644
index 0000000000000000000000000000000000000000..1adfdc755badc6b866a9b2d88c7196dc93117271
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32300_9ed90aa17207df349d17.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9ed90aa17207df349d174af06b982dc2e7928ae4cbcedf4e3f6f65b76ef4fb01
+size 1176692
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32300_ddb59937592da775d402.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32300_ddb59937592da775d402.png
new file mode 100644
index 0000000000000000000000000000000000000000..5129f371eea693227ea9963aa9be7dd4ec714ca7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32300_ddb59937592da775d402.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ddb59937592da775d4027ffe4bdaf3617cd83abc4e9e62bf8befafd63cd966b2
+size 769956
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32500_465d06c518a2c5d81517.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32500_465d06c518a2c5d81517.png
new file mode 100644
index 0000000000000000000000000000000000000000..e1b0a775024a1c3036aa24ebe581db632f5314f0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32500_465d06c518a2c5d81517.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:465d06c518a2c5d815179be2a1a812cdc1e122c67d2394f11c9f61c1fd814b58
+size 1017013
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32500_465f9dab59468a80847b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32500_465f9dab59468a80847b.png
new file mode 100644
index 0000000000000000000000000000000000000000..7c8e5fc0da7212a0412e78f6d530999432a16567
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32500_465f9dab59468a80847b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:465f9dab59468a80847bfbd712701daea5d5b491e11c94b230f65c6d3f685dd1
+size 1039350
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32500_8ee7f325a412373b6fdc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32500_8ee7f325a412373b6fdc.png
new file mode 100644
index 0000000000000000000000000000000000000000..4bb3d972de72cabda8f6ceb87aabca6b1ebc5002
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32500_8ee7f325a412373b6fdc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8ee7f325a412373b6fdc62a9ced7aa5148c2f7435fa28dcc6cdc98af43064f31
+size 832275
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32500_e062c8bfa21e74367c53.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32500_e062c8bfa21e74367c53.png
new file mode 100644
index 0000000000000000000000000000000000000000..5a29699149b6155282dbe8454dd471fd8dbf5a99
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32500_e062c8bfa21e74367c53.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e062c8bfa21e74367c5308a33c027d5fca53d128255680be634b2e374c788a56
+size 721888
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32700_934aaad613b92ea44598.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32700_934aaad613b92ea44598.png
new file mode 100644
index 0000000000000000000000000000000000000000..fb4aa11f2d8f620d532893562fbe194b23399b1d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32700_934aaad613b92ea44598.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:934aaad613b92ea44598e03ba0fc220c726dc55e8f21fb76d3389f4ddf47508f
+size 304709
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32700_9c9e88ded6ef12a29161.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32700_9c9e88ded6ef12a29161.png
new file mode 100644
index 0000000000000000000000000000000000000000..7bcf490a6dd65a66d4022f881631b9193c114cdd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32700_9c9e88ded6ef12a29161.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9c9e88ded6ef12a29161e6d3e14947304fda04e5c54a34b5155cffd1697d076d
+size 586974
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32700_ea41e1f1600872f1c377.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32700_ea41e1f1600872f1c377.png
new file mode 100644
index 0000000000000000000000000000000000000000..489c12d0d99ee47311f64e4826a2961c43381b0d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32700_ea41e1f1600872f1c377.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ea41e1f1600872f1c37786d35d15d5d587db3b5f4067cfa251eda0812600cb9c
+size 1052278
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32700_f1c857f72f64c16607ba.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32700_f1c857f72f64c16607ba.png
new file mode 100644
index 0000000000000000000000000000000000000000..04421846163c65fac3ac3e20e3a68b4df104959d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32700_f1c857f72f64c16607ba.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f1c857f72f64c16607ba46421adb88267594d2b20074c824baaec175039d90a2
+size 730668
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32900_0929c8d6b8da8203ecf0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32900_0929c8d6b8da8203ecf0.png
new file mode 100644
index 0000000000000000000000000000000000000000..c3fb07de87453adc23d1005e703d7da818bd8a80
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32900_0929c8d6b8da8203ecf0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0929c8d6b8da8203ecf0f12288c66778294e3f4fd2f468510c9e5b67bbe764a4
+size 997835
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32900_1df428b38e7ad59e467e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32900_1df428b38e7ad59e467e.png
new file mode 100644
index 0000000000000000000000000000000000000000..687f7ee0d0add0b7d7bb0a2139ddcf65aacb7a24
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32900_1df428b38e7ad59e467e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1df428b38e7ad59e467e867f04a6a78f7205fbd123b3b121dabe784790a3066c
+size 708969
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32900_2fdfdae20f0bee734825.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32900_2fdfdae20f0bee734825.png
new file mode 100644
index 0000000000000000000000000000000000000000..30eb8f7deef7f9995c238a0e735c789c13bc61fb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32900_2fdfdae20f0bee734825.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2fdfdae20f0bee734825d24f2fb43e3625f36534b2823d5ae4fd3205404d1c01
+size 1008163
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32900_e1b638835abb7e36dba5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32900_e1b638835abb7e36dba5.png
new file mode 100644
index 0000000000000000000000000000000000000000..6c299958d03c7c3ae7f41839c8cbbc4a5b3c3be0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_32900_e1b638835abb7e36dba5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e1b638835abb7e36dba5c5afdae817c7585d8a4e49276d72dcb79cfa90da2a5f
+size 886180
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3300_1692102175ccef9a3154.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3300_1692102175ccef9a3154.png
new file mode 100644
index 0000000000000000000000000000000000000000..93752f7283e44cfe6c215393309fbf2d61a1c3b7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3300_1692102175ccef9a3154.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1692102175ccef9a31547641c3add7516af0433acbd1c773babf61cdacd87954
+size 645496
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3300_6e958d2ca6da92235d83.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3300_6e958d2ca6da92235d83.png
new file mode 100644
index 0000000000000000000000000000000000000000..a1f9ca2fe53cb4531b70e0c336d8445d83098f35
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3300_6e958d2ca6da92235d83.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e958d2ca6da92235d833dffde029461570853fbd82c134b809c668071145c71
+size 723200
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3300_e4bac3c93cae1ef82fa7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3300_e4bac3c93cae1ef82fa7.png
new file mode 100644
index 0000000000000000000000000000000000000000..4bf8b4dcb30d7911fb5f9352b3fbf438c9e2dcaa
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3300_e4bac3c93cae1ef82fa7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e4bac3c93cae1ef82fa75da4c7e475c64d318fd639e93eb33dd8be3f51762068
+size 683905
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3300_fae7d0668953f48e3626.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3300_fae7d0668953f48e3626.png
new file mode 100644
index 0000000000000000000000000000000000000000..bbaf839f68f2cfab5fa741ff82d92386cd8eb5de
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3300_fae7d0668953f48e3626.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fae7d0668953f48e3626d43e6541cd03502f57b613327ff680ed7c48123f29f2
+size 889626
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33100_12e6d8f92c77708096b3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33100_12e6d8f92c77708096b3.png
new file mode 100644
index 0000000000000000000000000000000000000000..7159912bbbbcb3f5368d876060795819e05b9ece
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33100_12e6d8f92c77708096b3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:12e6d8f92c77708096b351e4198b0f231b7b8792f82ecd52b410cdcb15a2a1dc
+size 628897
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33100_3a6f59d181f5129a452c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33100_3a6f59d181f5129a452c.png
new file mode 100644
index 0000000000000000000000000000000000000000..029d340b0283aa8332b166f28e31d612565332ba
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33100_3a6f59d181f5129a452c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3a6f59d181f5129a452c7e6892dbaaf2eb021a311bc5e60106ebd229bc9f19e1
+size 989226
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33100_a19f4f7467bc9d040c65.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33100_a19f4f7467bc9d040c65.png
new file mode 100644
index 0000000000000000000000000000000000000000..13c0ee8a7ed69436b859bdd13741c0b94f3bb41d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33100_a19f4f7467bc9d040c65.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a19f4f7467bc9d040c65db58f1b9b2a29852341a2b5da7e5c380d17640836c5d
+size 528092
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33100_a2366d9b7da0c0877ca2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33100_a2366d9b7da0c0877ca2.png
new file mode 100644
index 0000000000000000000000000000000000000000..2a65c08314fed439753652bccd7d45f43fc53da5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33100_a2366d9b7da0c0877ca2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a2366d9b7da0c0877ca2a6fd5f415b4f7fd47c5ff69726c1246fe423ae60b752
+size 1205715
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33300_1ce7d8b2822cf3f3b487.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33300_1ce7d8b2822cf3f3b487.png
new file mode 100644
index 0000000000000000000000000000000000000000..9d98442d97969b987bfd3d8227b9ac102ba310fd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33300_1ce7d8b2822cf3f3b487.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1ce7d8b2822cf3f3b48780bf01d51dfa63f9ecb2a5720f3203b7e695e9443b82
+size 1061918
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33300_27f38fe7f5c268498a05.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33300_27f38fe7f5c268498a05.png
new file mode 100644
index 0000000000000000000000000000000000000000..f4e4fa61bf594867ff2a0f097b8487d2474a6cb2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33300_27f38fe7f5c268498a05.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:27f38fe7f5c268498a05560f86c444dd5ab32c0715a1185e919f500a1e44e84b
+size 405174
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33300_87168567909580e427c3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33300_87168567909580e427c3.png
new file mode 100644
index 0000000000000000000000000000000000000000..b69ee5202dbac802a57dd8eea8e47e1e80fafbcf
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33300_87168567909580e427c3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:87168567909580e427c3eae984f30495fcaf232236f414abfba16a6a4564ea05
+size 496342
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33300_f197961e1aec8c4574fb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33300_f197961e1aec8c4574fb.png
new file mode 100644
index 0000000000000000000000000000000000000000..ecb82f81da4edfbd71c7f2cb62ff949bb0ddfc73
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33300_f197961e1aec8c4574fb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f197961e1aec8c4574fb0787e3b9544ffd2d4e2a91596d73fbbdcd0e1d62fab5
+size 626052
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33500_3c394cb5719d74b8f276.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33500_3c394cb5719d74b8f276.png
new file mode 100644
index 0000000000000000000000000000000000000000..e827cb3f7f213b7efd3b46afbebe2d3a6fa75330
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33500_3c394cb5719d74b8f276.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3c394cb5719d74b8f276bb8427bdfa10ca2fb4e0cc484ba131bc73ea8cac5876
+size 899433
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33500_75f5bfb11c53bd3c8548.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33500_75f5bfb11c53bd3c8548.png
new file mode 100644
index 0000000000000000000000000000000000000000..9697bc72a81f5402b30cab4c6f94c11082fe68a8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33500_75f5bfb11c53bd3c8548.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:75f5bfb11c53bd3c85484c4e935faeaa6cd92405f39da70bdc665cda38a7d304
+size 487063
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33500_a4cdfe70c444861f7852.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33500_a4cdfe70c444861f7852.png
new file mode 100644
index 0000000000000000000000000000000000000000..299eca248fe6460775618bd82eeb9c41d60f3dea
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33500_a4cdfe70c444861f7852.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a4cdfe70c444861f7852908e96a9824c9bf37013446d2e97ab7b974a942c8333
+size 671767
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33500_ae1d6c5bd577e99f9bc7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33500_ae1d6c5bd577e99f9bc7.png
new file mode 100644
index 0000000000000000000000000000000000000000..d1578e13f44e46cf5a2f118c0ca0478459beaf87
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33500_ae1d6c5bd577e99f9bc7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ae1d6c5bd577e99f9bc7c32889de6c81d5fb7b05e4572d46ae0481b7acc8b35a
+size 688224
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33700_1c350d75c19c1e37142b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33700_1c350d75c19c1e37142b.png
new file mode 100644
index 0000000000000000000000000000000000000000..a413a93cd59955cd29bda1eb6d2e252259ce30b3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33700_1c350d75c19c1e37142b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1c350d75c19c1e37142b1ab469a98c537d4833c0bf2177a1786c94312847bef6
+size 1180018
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33700_340ede0b026b6e41dce1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33700_340ede0b026b6e41dce1.png
new file mode 100644
index 0000000000000000000000000000000000000000..20f4523a688c37f805ede771f311efae0758f7dc
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33700_340ede0b026b6e41dce1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:340ede0b026b6e41dce1582d0408eee7439844e96029f8ba8d97ebe1e0b2b37e
+size 725348
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33700_8d1fbef30c529dbec276.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33700_8d1fbef30c529dbec276.png
new file mode 100644
index 0000000000000000000000000000000000000000..099d8c0ace7982a6f9c7e6a7ccd234fbb142be4f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33700_8d1fbef30c529dbec276.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8d1fbef30c529dbec276e49bf5ca084552fd0a9c0fe387fc0ba74a9bb9f16f4f
+size 591011
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33700_b20687af87f7455b45de.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33700_b20687af87f7455b45de.png
new file mode 100644
index 0000000000000000000000000000000000000000..b0dae0cae6511374b79421b802c330968926fd58
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33700_b20687af87f7455b45de.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b20687af87f7455b45de5174131cc23b4b9b285dc1ddb710043611a03245c090
+size 877619
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33900_13b1154773e69965b4f3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33900_13b1154773e69965b4f3.png
new file mode 100644
index 0000000000000000000000000000000000000000..a4e35c73b545b6e310bc64ea8b5010e8af229012
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33900_13b1154773e69965b4f3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:13b1154773e69965b4f3b7346274bf72dd41839911c871894e8f92a6b74e7a97
+size 460776
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33900_3d25f55c7f06f37d1477.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33900_3d25f55c7f06f37d1477.png
new file mode 100644
index 0000000000000000000000000000000000000000..0b41624162ccd6cd8f4222c03a53dc6bed5c564d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33900_3d25f55c7f06f37d1477.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3d25f55c7f06f37d147734db6efee3b67e25ce8780ee9caf328c062ebe432597
+size 1125157
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33900_72bc054432db275cbc30.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33900_72bc054432db275cbc30.png
new file mode 100644
index 0000000000000000000000000000000000000000..a85f72247fec060add94b7a983e798e235793d93
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33900_72bc054432db275cbc30.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:72bc054432db275cbc30b801d84f438f8ec6c2c28b0ea5fec249ee779a2a76e1
+size 617972
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33900_7ec9bec70d753290d5bb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33900_7ec9bec70d753290d5bb.png
new file mode 100644
index 0000000000000000000000000000000000000000..45ff5e84f3843a56a49f185c20c42eb24749c577
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_33900_7ec9bec70d753290d5bb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7ec9bec70d753290d5bbcd1c7381c95ca7e1618bf1a48247ed34a8222a632af9
+size 697954
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34100_2dc5bb3c2dff04080922.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34100_2dc5bb3c2dff04080922.png
new file mode 100644
index 0000000000000000000000000000000000000000..eb978a2e88d8adfc61ea50296f99dcacc59855ff
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34100_2dc5bb3c2dff04080922.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2dc5bb3c2dff040809223a6a8e0d67f4117e64eca88699cbbf2ffc0b5db09b34
+size 195793
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34100_51ea20690ee1620ee3dc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34100_51ea20690ee1620ee3dc.png
new file mode 100644
index 0000000000000000000000000000000000000000..6cf39dcaa53f511aef1aaad8e62bda88e486c182
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34100_51ea20690ee1620ee3dc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:51ea20690ee1620ee3dc2bf32aeee5cf9dae661095e8ec902f8fe9ade0a32867
+size 450784
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34100_bb7c7c0fef151e82c08d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34100_bb7c7c0fef151e82c08d.png
new file mode 100644
index 0000000000000000000000000000000000000000..4329a3986f009fb92db62a0ec5036a1bb594ae09
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34100_bb7c7c0fef151e82c08d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bb7c7c0fef151e82c08d68692d21b2a8df7957dcad3f7b256ac40b59d837c4a7
+size 1430471
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34100_f6d1ffd9611b1543562d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34100_f6d1ffd9611b1543562d.png
new file mode 100644
index 0000000000000000000000000000000000000000..ecca1a2ffbc638d455a46ba0452e8e304041e0fd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34100_f6d1ffd9611b1543562d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f6d1ffd9611b1543562de4d55acba6cdfd7fd18d7cdcb0f1ad88eed7bdfe9599
+size 579639
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34300_7001bcb7e1eb94bcebeb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34300_7001bcb7e1eb94bcebeb.png
new file mode 100644
index 0000000000000000000000000000000000000000..f84605aebdee5d8bd8d738d4724277f95b00e71f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34300_7001bcb7e1eb94bcebeb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7001bcb7e1eb94bcebebdcc29c2f7497c9fbcc75768c1c81ae1b8467c5c04597
+size 543808
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34300_784d3cff9dd2eff2ccee.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34300_784d3cff9dd2eff2ccee.png
new file mode 100644
index 0000000000000000000000000000000000000000..6d8ec8cc402483876946c44926d93f27e4fd87f1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34300_784d3cff9dd2eff2ccee.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:784d3cff9dd2eff2ccee128dd7bd26a08de08ead07b09d701c41d52d756fd7a3
+size 396437
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34300_afa53d492e81815c6e16.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34300_afa53d492e81815c6e16.png
new file mode 100644
index 0000000000000000000000000000000000000000..8a1c85a2e6ad731b21afcd06452559906c1b67fb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34300_afa53d492e81815c6e16.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:afa53d492e81815c6e161d3e6210ac565efca32588b608ca746962faa29ee8ad
+size 1787243
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34300_dce7b9f4c579256f99e7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34300_dce7b9f4c579256f99e7.png
new file mode 100644
index 0000000000000000000000000000000000000000..bb75f3625da9896e34016deccf9270d49f0d71d4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34300_dce7b9f4c579256f99e7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dce7b9f4c579256f99e72aee2e8569175b7816ca0582503e388973c028ac9ecf
+size 653726
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34500_21f04328572882e0e090.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34500_21f04328572882e0e090.png
new file mode 100644
index 0000000000000000000000000000000000000000..d20610f989b2bf9a4e9d3baa9c2216f033e9815f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34500_21f04328572882e0e090.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:21f04328572882e0e0900b9e569f03b6db5c46ad49d8e690a6045bfe2b5783e4
+size 466597
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34500_27617c6e8d03213fdc51.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34500_27617c6e8d03213fdc51.png
new file mode 100644
index 0000000000000000000000000000000000000000..82a6413aa801e81178fa052064f2ea05617a887f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34500_27617c6e8d03213fdc51.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:27617c6e8d03213fdc51c0e3709778609cf7b542aadfde031de3017014b2df91
+size 755395
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34500_7f08dfeca25978a339a6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34500_7f08dfeca25978a339a6.png
new file mode 100644
index 0000000000000000000000000000000000000000..98709a90ceaadc08b38ef0b263cb749cab18a3e4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34500_7f08dfeca25978a339a6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7f08dfeca25978a339a63ae96dbd7faddb2a2346d8a7afeb10dd41d9271b3163
+size 608820
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34500_b06950e7274d7d696369.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34500_b06950e7274d7d696369.png
new file mode 100644
index 0000000000000000000000000000000000000000..eb548adef3f44e1377ee398fbad46cf4aa4b8096
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34500_b06950e7274d7d696369.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b06950e7274d7d6963696cafbbed7bfe05ade20a71fc4308fed0901d5dceddc1
+size 1565222
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34700_b1e8fa26b21d3d1efee2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34700_b1e8fa26b21d3d1efee2.png
new file mode 100644
index 0000000000000000000000000000000000000000..1e74eb7143ba361d1fbc99a184ff12c1b97afba1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34700_b1e8fa26b21d3d1efee2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b1e8fa26b21d3d1efee2a851c8ced3ccce6bba35c0e03d0c3eefc36021f04f19
+size 446428
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34700_bd4545eff810eefa7dcd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34700_bd4545eff810eefa7dcd.png
new file mode 100644
index 0000000000000000000000000000000000000000..0a83ec4dc9f4b1e6a97bcef0d4c2ac72af9cd424
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34700_bd4545eff810eefa7dcd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bd4545eff810eefa7dcd12ad2e7d2d8de5828e9ae7748abbc66e2e4d7f3a6c6e
+size 1687217
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34700_e2faa65bd6e611fdf8e0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34700_e2faa65bd6e611fdf8e0.png
new file mode 100644
index 0000000000000000000000000000000000000000..1258095182690e34ab5ed4b55900b02a1e32da99
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34700_e2faa65bd6e611fdf8e0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e2faa65bd6e611fdf8e091fc2a6d5c0e1be27d848cb18c4501d906ef6431ae02
+size 618212
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34700_ebecbd7096e6b4df4996.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34700_ebecbd7096e6b4df4996.png
new file mode 100644
index 0000000000000000000000000000000000000000..5f482acd55a65ac053d3313de89cbedffbc10780
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34700_ebecbd7096e6b4df4996.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ebecbd7096e6b4df4996db9da735008b44326d8a1f3e131974e88d12d9b27a0c
+size 845105
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34900_5250b14455efd7d58d2a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34900_5250b14455efd7d58d2a.png
new file mode 100644
index 0000000000000000000000000000000000000000..cba9696d4cc5e6e5fab60dd7f08e9e650f942cde
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34900_5250b14455efd7d58d2a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5250b14455efd7d58d2ad011c21f794c17a4d7e6bfacda551a44a2d8f4e80a5b
+size 1460348
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34900_65ce9f1c044599be8add.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34900_65ce9f1c044599be8add.png
new file mode 100644
index 0000000000000000000000000000000000000000..d17b51bcabd7d6b07048b52194949931481e6436
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34900_65ce9f1c044599be8add.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:65ce9f1c044599be8add56678b358a1a49ada5685fd7a7080ff3fbf44739b7af
+size 594730
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34900_8cda4e3a0592dc125fa1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34900_8cda4e3a0592dc125fa1.png
new file mode 100644
index 0000000000000000000000000000000000000000..b26fb64acb679a734e6e4bd8c25d7aa8235a699e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34900_8cda4e3a0592dc125fa1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8cda4e3a0592dc125fa1a2431a5c8921cc1cae6dbce58170ec385e068c132f55
+size 558862
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34900_a7fcf85f148fe0423a33.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34900_a7fcf85f148fe0423a33.png
new file mode 100644
index 0000000000000000000000000000000000000000..2da7b498ed9ed8e0555fe7ec0a1a66b7398f6bdf
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_34900_a7fcf85f148fe0423a33.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a7fcf85f148fe0423a3362c29fe3c7b7d08e3456fe665a5ced20aab477cdd6e8
+size 495141
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3500_5a7c5a8971d59d426aa4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3500_5a7c5a8971d59d426aa4.png
new file mode 100644
index 0000000000000000000000000000000000000000..bc6a6a0d03fe93a8ee6d204be30da52e47e00439
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3500_5a7c5a8971d59d426aa4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5a7c5a8971d59d426aa4e762d071b628131f5fb877079db30939e75890287bb1
+size 448246
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3500_796941b7e64e99dc4e2a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3500_796941b7e64e99dc4e2a.png
new file mode 100644
index 0000000000000000000000000000000000000000..9fc1a29b5c69f0fc3c4ff4ff99112dad5ca29611
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3500_796941b7e64e99dc4e2a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:796941b7e64e99dc4e2a89cdd217ec2a915dd1ad58b743dad2b7562b847c9d5c
+size 704184
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3500_ead6cc86e1e9ca08a7cc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3500_ead6cc86e1e9ca08a7cc.png
new file mode 100644
index 0000000000000000000000000000000000000000..05c00294dcfd7200ac71a6106fd5fe7fdd8ce245
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3500_ead6cc86e1e9ca08a7cc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ead6cc86e1e9ca08a7cca74eac29e5bf85c0e6b64b3b9f1db40a7a69b11dc126
+size 823220
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3500_eef1a09c4c2f76fd0ce3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3500_eef1a09c4c2f76fd0ce3.png
new file mode 100644
index 0000000000000000000000000000000000000000..7af43d84f61d9ddb46d70824c481d171e616b63d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3500_eef1a09c4c2f76fd0ce3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:eef1a09c4c2f76fd0ce302b383d8c65aa860c879c4094c02721b48122f189474
+size 664061
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35100_5f27c03752c935e91def.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35100_5f27c03752c935e91def.png
new file mode 100644
index 0000000000000000000000000000000000000000..2bfe2705c554c7b89f88f8a3bd72f6d4e04109c4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35100_5f27c03752c935e91def.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5f27c03752c935e91deffad615309f15b57eb0d56285f3c363eed461e3f81f0d
+size 886978
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35100_9f39658c6d7d695328fe.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35100_9f39658c6d7d695328fe.png
new file mode 100644
index 0000000000000000000000000000000000000000..3bc053eafeb10d6f284c1cda7f2d17ba30d1dea8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35100_9f39658c6d7d695328fe.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9f39658c6d7d695328feb408e4a4ea40608c91264f7d4fd2296950a0ce363289
+size 798642
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35100_b74ef22fbacc5d724ee3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35100_b74ef22fbacc5d724ee3.png
new file mode 100644
index 0000000000000000000000000000000000000000..ad5760cfd478b9a7822febee5fba8474b6db86bc
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35100_b74ef22fbacc5d724ee3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b74ef22fbacc5d724ee376e1bfe0628c9e6bb564fd36547e77141535c678d4d6
+size 1393439
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35100_e807a34da0ee156422ba.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35100_e807a34da0ee156422ba.png
new file mode 100644
index 0000000000000000000000000000000000000000..ae9414a7fedd02cda96e2eb880355a6bcfe60f8d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35100_e807a34da0ee156422ba.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e807a34da0ee156422ba77789f57a55fd6ddba838477645f1776cdfb1c2786e6
+size 854093
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35300_618657a419019bb57bb9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35300_618657a419019bb57bb9.png
new file mode 100644
index 0000000000000000000000000000000000000000..2c78fe5203f94a7810c9c5c617e68feff023aa2c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35300_618657a419019bb57bb9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:618657a419019bb57bb93e885be9193fb0c01bff89862a4068ff008544153491
+size 601391
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35300_645c50b3066f968fbbbe.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35300_645c50b3066f968fbbbe.png
new file mode 100644
index 0000000000000000000000000000000000000000..ac319316e28d2124064418e11d643a59bdd71492
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35300_645c50b3066f968fbbbe.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:645c50b3066f968fbbbe771ab0f1018dd908d1696d32a2748bdfdf089ab472f9
+size 755750
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35300_b9a9ee61d23e3158f570.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35300_b9a9ee61d23e3158f570.png
new file mode 100644
index 0000000000000000000000000000000000000000..07ff914b4c57fc05e89307c1b74b7ceac03ab48c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35300_b9a9ee61d23e3158f570.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b9a9ee61d23e3158f5706dd7cc0f4df54aea8e76c42106c0de5a0663207265e2
+size 860695
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35300_f96c920e005dc9294a80.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35300_f96c920e005dc9294a80.png
new file mode 100644
index 0000000000000000000000000000000000000000..a1075a7c448b30a95ee5b4e7e7c0426a5d9c631a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35300_f96c920e005dc9294a80.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f96c920e005dc9294a80f6a917fa678ebfe2d9bedb0702fc53f6d78f261d0402
+size 1807469
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35500_431d06211bf6cc5a1b33.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35500_431d06211bf6cc5a1b33.png
new file mode 100644
index 0000000000000000000000000000000000000000..37eab077a300c02a3db5fbac2993658698343a0a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35500_431d06211bf6cc5a1b33.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:431d06211bf6cc5a1b33943854d5cfda02d74f10fb6995800e11c62a0408d747
+size 450549
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35500_74fda1343e4d9d7f775a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35500_74fda1343e4d9d7f775a.png
new file mode 100644
index 0000000000000000000000000000000000000000..78d1d9ba5d3208453d425b941b278a1ce833f72a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35500_74fda1343e4d9d7f775a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:74fda1343e4d9d7f775aff195cad2d017c03d0da34410ab04562b512604ad993
+size 1267936
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35500_b7cb1f4be7a4e0705e5a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35500_b7cb1f4be7a4e0705e5a.png
new file mode 100644
index 0000000000000000000000000000000000000000..618b4302d513b33aee9174603b82d4e91f563c53
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35500_b7cb1f4be7a4e0705e5a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b7cb1f4be7a4e0705e5a1c4604b03cf1ead8aee1c4fc01aedf53a883ba00f3b1
+size 1111159
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35500_ea6f850c730ee011f37a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35500_ea6f850c730ee011f37a.png
new file mode 100644
index 0000000000000000000000000000000000000000..777b3bb8ec6fb36b7e872626d4979f5ed65b663c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35500_ea6f850c730ee011f37a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ea6f850c730ee011f37ad45bf6e623714b74092521a1a00cfe3e0e20a8de68ff
+size 569563
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35700_3296b9faf4fb10ee83f8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35700_3296b9faf4fb10ee83f8.png
new file mode 100644
index 0000000000000000000000000000000000000000..b825b34776795d8d6ba4b491a3ef68aef5ad3bfb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35700_3296b9faf4fb10ee83f8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3296b9faf4fb10ee83f882a03ff14e7e365c2474d11c4caaa35ff39e6049419a
+size 800421
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35700_3f82fa916d90987da476.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35700_3f82fa916d90987da476.png
new file mode 100644
index 0000000000000000000000000000000000000000..af8033a4bdd6abc9fc5290f407db14bf6dca7f45
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35700_3f82fa916d90987da476.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3f82fa916d90987da476fc783fa2d7cdbcfdbfc2851714f737834e92ba3946d3
+size 662030
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35700_53ac95835f46fff119f1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35700_53ac95835f46fff119f1.png
new file mode 100644
index 0000000000000000000000000000000000000000..515d935836f78f7d62f8be6878dadb12e2f2ecd3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35700_53ac95835f46fff119f1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:53ac95835f46fff119f1abd1a300c95335169e2e15bf59956ff277c1981fece7
+size 130089
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35700_d0e28a8a6cdd475c4c64.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35700_d0e28a8a6cdd475c4c64.png
new file mode 100644
index 0000000000000000000000000000000000000000..ea5b1c50dc5db8aa810db6b2ac9a224fab2a1366
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35700_d0e28a8a6cdd475c4c64.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d0e28a8a6cdd475c4c64d1043aba94aa7cfadd62e0e0434901f5c77212e62c87
+size 1709704
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35900_247d8c3029398281316a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35900_247d8c3029398281316a.png
new file mode 100644
index 0000000000000000000000000000000000000000..bd2cfd329fa7a86f3ae8064070e454841917fdc9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35900_247d8c3029398281316a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:247d8c3029398281316a2ef79756d2da265ae887d314c628025c69bb412e1023
+size 326219
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35900_6e384872404fe4f20d7f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35900_6e384872404fe4f20d7f.png
new file mode 100644
index 0000000000000000000000000000000000000000..dcc349a95e3e4fedd84721626922f023aa84bb47
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35900_6e384872404fe4f20d7f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e384872404fe4f20d7fa9f31283a2210c72986e4d3217b8f80ad5029f67fc53
+size 1208103
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35900_83312411e6151fa30dee.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35900_83312411e6151fa30dee.png
new file mode 100644
index 0000000000000000000000000000000000000000..166245b1ba51be9a3f01810fde559b21fccaa67d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35900_83312411e6151fa30dee.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:83312411e6151fa30deed8bc2029389831eb5ccf1f15894913107300e3d47a11
+size 686664
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35900_c3f7548a657732a8d722.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35900_c3f7548a657732a8d722.png
new file mode 100644
index 0000000000000000000000000000000000000000..f95a96290fac25652fb67c62c8b9132132da19eb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_35900_c3f7548a657732a8d722.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c3f7548a657732a8d722164d8d9d5f50cab38f03b8cd902f9fd2cfc4c5b7a29c
+size 901290
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36100_2a9600288e78bbb2afbf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36100_2a9600288e78bbb2afbf.png
new file mode 100644
index 0000000000000000000000000000000000000000..45a3c88eb6df716021f38c89912313a442b27bcd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36100_2a9600288e78bbb2afbf.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2a9600288e78bbb2afbf948e911a54f537625177da32c3a19826b110db62c6dc
+size 1441230
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36100_45d27c8e00f7090f6bfc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36100_45d27c8e00f7090f6bfc.png
new file mode 100644
index 0000000000000000000000000000000000000000..dcc65692a47f267ac4d3dc49a625000ed8b40daf
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36100_45d27c8e00f7090f6bfc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:45d27c8e00f7090f6bfcafe97fb2fd563fbb1a2cc8801009111ee97dcb181dc8
+size 540484
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36100_80a9a667a41533b17480.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36100_80a9a667a41533b17480.png
new file mode 100644
index 0000000000000000000000000000000000000000..5cbd20231308cca1c9c78e158906c88b0072846c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36100_80a9a667a41533b17480.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:80a9a667a41533b174808cdba7b6fb9ac01cc2fce54965829517241a2b8f4bad
+size 335890
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36100_84fa2ad5ce08e74207a6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36100_84fa2ad5ce08e74207a6.png
new file mode 100644
index 0000000000000000000000000000000000000000..98074a7c44855136a2525c9ca7260b2b701df4d5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36100_84fa2ad5ce08e74207a6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:84fa2ad5ce08e74207a65a912eb6a69e7eae9695a602673b6eea505adb8bd732
+size 1031112
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36300_13de23e0892355f7d449.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36300_13de23e0892355f7d449.png
new file mode 100644
index 0000000000000000000000000000000000000000..d46608de6583993e25bafda6d482ee6ac5057ab1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36300_13de23e0892355f7d449.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:13de23e0892355f7d449c88e0e1d1df43546dc3a629c37e74c447105ba162f6f
+size 1394901
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36300_70ce1f4a5f0e03bc6a9a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36300_70ce1f4a5f0e03bc6a9a.png
new file mode 100644
index 0000000000000000000000000000000000000000..28f69b6f9f01991560280db03eaf26a3558e4f30
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36300_70ce1f4a5f0e03bc6a9a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:70ce1f4a5f0e03bc6a9ab2a13a0e5fb754b1c226949df89015b7bc4e60d9e8e8
+size 1014571
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36300_7160d6e8c310ae51d94e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36300_7160d6e8c310ae51d94e.png
new file mode 100644
index 0000000000000000000000000000000000000000..2be765d31c8f06ac7d1de3d3cbef308c476b0e9f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36300_7160d6e8c310ae51d94e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7160d6e8c310ae51d94e4950d29979af75f5f0787514c5464c6659d016edee6c
+size 1430571
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36300_c2cd6ca27ce8a4edfab6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36300_c2cd6ca27ce8a4edfab6.png
new file mode 100644
index 0000000000000000000000000000000000000000..8078ce1f4f08bdc741e44952f4833bff17502e48
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36300_c2cd6ca27ce8a4edfab6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c2cd6ca27ce8a4edfab6fa90f9fdf839e45b55f52aeea5a466c9746718930d00
+size 704505
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36500_5c711d6b7907c71a9fa3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36500_5c711d6b7907c71a9fa3.png
new file mode 100644
index 0000000000000000000000000000000000000000..c152343a63db0a68671b6d79168a375233a08203
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36500_5c711d6b7907c71a9fa3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5c711d6b7907c71a9fa363a97d4483de8910d45c61401ed6b7aebd668c37a7ff
+size 811216
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36500_61d18919acb9418c9bc5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36500_61d18919acb9418c9bc5.png
new file mode 100644
index 0000000000000000000000000000000000000000..4f2a6bfca7731cf906cbe2c1556025a92b70ea60
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36500_61d18919acb9418c9bc5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:61d18919acb9418c9bc5e0c1bd3136f60a270288b774f8f1e430af2525c77b3a
+size 519556
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36500_c8dcf0f88826c9e77abb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36500_c8dcf0f88826c9e77abb.png
new file mode 100644
index 0000000000000000000000000000000000000000..783f1f0ff0ddeb8c04a0989b34a0427cdf02a155
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36500_c8dcf0f88826c9e77abb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c8dcf0f88826c9e77abba051a0d2ed86567595c9a04c4ccdd106debb89868965
+size 961852
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36500_e3f9500b264d7f8b401f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36500_e3f9500b264d7f8b401f.png
new file mode 100644
index 0000000000000000000000000000000000000000..2b4bcd549514cbb8c69dc9c7e36d0af46615e539
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36500_e3f9500b264d7f8b401f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e3f9500b264d7f8b401f031fc252812f53755f4ece5724f61253631942341723
+size 669955
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36700_29e52383b996d08da6bc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36700_29e52383b996d08da6bc.png
new file mode 100644
index 0000000000000000000000000000000000000000..12763ba98489f7c271b682cd0e8240af962d19ba
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36700_29e52383b996d08da6bc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:29e52383b996d08da6bc23b06ac1f6b61313b923dac35a61de08e6086156d86a
+size 525038
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36700_2ae953ab2b6799a056b0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36700_2ae953ab2b6799a056b0.png
new file mode 100644
index 0000000000000000000000000000000000000000..be5d1a78698dccd1382f5b5c763b64bd10c7a49d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36700_2ae953ab2b6799a056b0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2ae953ab2b6799a056b0755216bea57f7448d37ccddb534b795a0612e6145223
+size 1050231
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36700_553b42f888d8097ec7c3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36700_553b42f888d8097ec7c3.png
new file mode 100644
index 0000000000000000000000000000000000000000..fefb2a7f844e5007434de363f9b411484c2a2ce6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36700_553b42f888d8097ec7c3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:553b42f888d8097ec7c33b319d7627c9822b12cec5f5b272fa6c224a1d1853ca
+size 782618
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36700_6362027a912e94d1bdc5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36700_6362027a912e94d1bdc5.png
new file mode 100644
index 0000000000000000000000000000000000000000..e7725f087ba05a405b62af1b8aba44bbec25bdee
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36700_6362027a912e94d1bdc5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6362027a912e94d1bdc50c577919e874af72a5c530a960c4960fbfc4bcbf60d9
+size 1217586
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36900_13fefa72c6f4f2670f25.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36900_13fefa72c6f4f2670f25.png
new file mode 100644
index 0000000000000000000000000000000000000000..949abdf1ef917d5655a56140efbab78c861f408f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36900_13fefa72c6f4f2670f25.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:13fefa72c6f4f2670f2535128e34ee682c19ecb886f03a3a2b0c6c09000ab0ac
+size 1423394
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36900_72316d44a08c7aa9a7e8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36900_72316d44a08c7aa9a7e8.png
new file mode 100644
index 0000000000000000000000000000000000000000..0b572c0624f85a93b5cb6ec4a31b3a0715b67901
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36900_72316d44a08c7aa9a7e8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:72316d44a08c7aa9a7e8550590558bccae5154ae2165565624240da96caf52a2
+size 450579
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36900_c4c8051e13724d1abe68.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36900_c4c8051e13724d1abe68.png
new file mode 100644
index 0000000000000000000000000000000000000000..327c00503517d47abc4151fc0198fb636a7c306b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36900_c4c8051e13724d1abe68.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c4c8051e13724d1abe68c99d667f9d1524601d0cdc88892fd8c8c432a69765ab
+size 512163
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36900_f62523de1460b42f777b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36900_f62523de1460b42f777b.png
new file mode 100644
index 0000000000000000000000000000000000000000..6465419c9569c18f0a68fc5f0cb2b1cf21425e20
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_36900_f62523de1460b42f777b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f62523de1460b42f777b255e4d80edcf3704150357ef5c4d9fcd427399b2d844
+size 867507
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3700_375b9fe31ca3d046548f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3700_375b9fe31ca3d046548f.png
new file mode 100644
index 0000000000000000000000000000000000000000..6794a86d44ea77d98531c841ba18f19c59e072a6
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3700_375b9fe31ca3d046548f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3700_8905a847bbc4918d518c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3700_8905a847bbc4918d518c.png
new file mode 100644
index 0000000000000000000000000000000000000000..659739a523ca90e10fa3b2c75a50f284ba8c19d7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3700_8905a847bbc4918d518c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8905a847bbc4918d518c985b05b9503c8ca7f5cd43a5b0b3bdac98d0e7964bca
+size 784623
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3700_bb38d87fb8c50823f536.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3700_bb38d87fb8c50823f536.png
new file mode 100644
index 0000000000000000000000000000000000000000..eac01c8e73b8f2e5e3569e9520ca0ccee46995eb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3700_bb38d87fb8c50823f536.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bb38d87fb8c50823f536909db2fa5d203992400968dec02dfc115d71159abd5d
+size 438989
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3700_d0b474fc559c694a5c74.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3700_d0b474fc559c694a5c74.png
new file mode 100644
index 0000000000000000000000000000000000000000..12c3495f7d9451a2163cd1a87621cdc93b3d58bb
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3700_d0b474fc559c694a5c74.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d0b474fc559c694a5c74a43ef784cbc3b195f71b26728504041ff4a0fb849f9e
+size 1211597
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37100_1e71cc4c0780a0804f05.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37100_1e71cc4c0780a0804f05.png
new file mode 100644
index 0000000000000000000000000000000000000000..04e51bb8020b3963702feec6cbf2f8bf479547df
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37100_1e71cc4c0780a0804f05.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1e71cc4c0780a0804f05120e9593c7db3d5e5315fe51b7feadaefa64bf3fb7a4
+size 545418
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37100_350f6cfc6b677dc7fd2f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37100_350f6cfc6b677dc7fd2f.png
new file mode 100644
index 0000000000000000000000000000000000000000..2037f824559f36d8403c5ef9fdc2c5df780a1014
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37100_350f6cfc6b677dc7fd2f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:350f6cfc6b677dc7fd2f771da20ca70cc353d7f509fb5682c7220cc69ba6bc67
+size 844606
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37100_8ce940ae46e0b635c7f3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37100_8ce940ae46e0b635c7f3.png
new file mode 100644
index 0000000000000000000000000000000000000000..8e8b504ddd5da4c6ca97dc3497e31be280373568
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37100_8ce940ae46e0b635c7f3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8ce940ae46e0b635c7f3ec32a5e6d549127f4fcffa29a0dc6c40bdc02e7595eb
+size 186638
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37100_9c39d4f187db01e96117.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37100_9c39d4f187db01e96117.png
new file mode 100644
index 0000000000000000000000000000000000000000..1d84de517cc95d0ae03260d6a4862d06fd1eb63d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37100_9c39d4f187db01e96117.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9c39d4f187db01e961170f3c7c73994a6263681a831183836a9d7bac21a5ae7c
+size 815852
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37300_1b06f4008b8e5991e34d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37300_1b06f4008b8e5991e34d.png
new file mode 100644
index 0000000000000000000000000000000000000000..bea466466f663d79bf5cd3418a0ecd72fd9f0846
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37300_1b06f4008b8e5991e34d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1b06f4008b8e5991e34d9a266132d652646e6ff8e0166da653ddca2e50e525d1
+size 472069
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37300_7981dda154a81d13e444.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37300_7981dda154a81d13e444.png
new file mode 100644
index 0000000000000000000000000000000000000000..41c9dfba45956edc71572e9aba274b6affab5a81
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37300_7981dda154a81d13e444.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7981dda154a81d13e44443b0d4d51607db8158eb1b388d979b4bced62ab7586b
+size 1145106
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37300_99605c038f6b6e484dde.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37300_99605c038f6b6e484dde.png
new file mode 100644
index 0000000000000000000000000000000000000000..e7842e6483f25726def38f3e3b3a6c5b2c2db917
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37300_99605c038f6b6e484dde.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:99605c038f6b6e484dde1635bf51b0b15826467c03e8d2d057f8dc6dc107070d
+size 770237
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37300_fc529bd50e0c3028faaa.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37300_fc529bd50e0c3028faaa.png
new file mode 100644
index 0000000000000000000000000000000000000000..e09d24cb900b93989bc5fe855443301d3f79de4e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37300_fc529bd50e0c3028faaa.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fc529bd50e0c3028faaa0889075ddfb149d689c977340fa6ebb37649d4bc13c3
+size 132250
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37500_0e3a5ff2fc029bc73aa1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37500_0e3a5ff2fc029bc73aa1.png
new file mode 100644
index 0000000000000000000000000000000000000000..54a5182948c3f5f52f82ad14f93509742075aa54
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37500_0e3a5ff2fc029bc73aa1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0e3a5ff2fc029bc73aa153f0bd414fee9bb575d97349c288efe140d15827d9ce
+size 1105840
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37500_1e4d5a01a5d90b30a9ca.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37500_1e4d5a01a5d90b30a9ca.png
new file mode 100644
index 0000000000000000000000000000000000000000..bb60e969068f4afe80db6ed94cfb581a965ee379
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37500_1e4d5a01a5d90b30a9ca.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1e4d5a01a5d90b30a9cae2061b2b8ae420cd5263a7b17ac1853b7237477be9bb
+size 326191
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37500_2e71eedd892b0dd2ecbe.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37500_2e71eedd892b0dd2ecbe.png
new file mode 100644
index 0000000000000000000000000000000000000000..6ec816d63ab8ee566b6c131a5a52035dfbd4bdff
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37500_2e71eedd892b0dd2ecbe.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2e71eedd892b0dd2ecbe176c0381d0f4004c50040f4048c10b5af563733c0385
+size 619464
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37500_872a5eb563c2030af46f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37500_872a5eb563c2030af46f.png
new file mode 100644
index 0000000000000000000000000000000000000000..16ddf4622d1f13c6e003f7a9625ac40555617c6f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37500_872a5eb563c2030af46f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:872a5eb563c2030af46f5340e08b2a322600c8fd1aabf94948400898d8d3dc63
+size 1904083
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37700_2760ea14d3f07e47f083.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37700_2760ea14d3f07e47f083.png
new file mode 100644
index 0000000000000000000000000000000000000000..3ad6f29c94a74f172ca8f05eb6381d6634838f4f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37700_2760ea14d3f07e47f083.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2760ea14d3f07e47f0839ee9d67c6df663d4ef8b1ef06a510952b6ad77021c02
+size 884385
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37700_3606c3af68412068ac4f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37700_3606c3af68412068ac4f.png
new file mode 100644
index 0000000000000000000000000000000000000000..fd065e73b5d97fc236e4b7299d90b4f18a22d441
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37700_3606c3af68412068ac4f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3606c3af68412068ac4fd023ea53179f8c698adcc868baaf6f28edc3e52cb35e
+size 487553
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37700_bc71976fbcc96b304393.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37700_bc71976fbcc96b304393.png
new file mode 100644
index 0000000000000000000000000000000000000000..c4b99f291c8821dd80d575fa3b904eb0163ca46c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37700_bc71976fbcc96b304393.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bc71976fbcc96b30439387f842dd7ddd4b7680754f8d39b1117815df4ccfb4c1
+size 1615140
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37700_d204023285efcbab37a1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37700_d204023285efcbab37a1.png
new file mode 100644
index 0000000000000000000000000000000000000000..4c1d82b613546ea0d532990446adea8572b52df2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37700_d204023285efcbab37a1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d204023285efcbab37a1fa35755cc571fd5940fffc44773f7415095385518356
+size 633255
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37900_8ac172f422b4c4371f43.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37900_8ac172f422b4c4371f43.png
new file mode 100644
index 0000000000000000000000000000000000000000..485a92faaa9d4469ad1b80521afc009a4411304e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37900_8ac172f422b4c4371f43.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8ac172f422b4c4371f434255cb450b64b858daab0d7dbcff1cb7db93d387ea2b
+size 560948
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37900_ca51298b8bf31add1804.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37900_ca51298b8bf31add1804.png
new file mode 100644
index 0000000000000000000000000000000000000000..64dcd41af9a3599b717cbd82096cd336a801696d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37900_ca51298b8bf31add1804.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ca51298b8bf31add1804c6deab0e578c99c946e4d56816648790e1420c9812d2
+size 1120620
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37900_cd6d5120cb62a7c99034.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37900_cd6d5120cb62a7c99034.png
new file mode 100644
index 0000000000000000000000000000000000000000..9d5ac6d9c97eb0fc1b7b9b16ab78e06522e6b08c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37900_cd6d5120cb62a7c99034.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cd6d5120cb62a7c99034ad6ec06502a330db29eabf7eb2a559ead6ebe74d7ea2
+size 579631
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37900_ffd6aa3e3a02141673e2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37900_ffd6aa3e3a02141673e2.png
new file mode 100644
index 0000000000000000000000000000000000000000..e9f72315eee576136c03ecf56abc3334364b9bc5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_37900_ffd6aa3e3a02141673e2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ffd6aa3e3a02141673e2862fae9884b2027ec17fc34e36447997fdc6943b0ade
+size 508493
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3900_145e786812a3fa570c81.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3900_145e786812a3fa570c81.png
new file mode 100644
index 0000000000000000000000000000000000000000..46ed9e094e39688dc1268ce20a6698e1a05947f0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3900_145e786812a3fa570c81.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:145e786812a3fa570c8189414f84e3f0815e5406c138a87f61ae23eb2cd2fb5e
+size 806923
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3900_70e33ce49fd5128cab27.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3900_70e33ce49fd5128cab27.png
new file mode 100644
index 0000000000000000000000000000000000000000..c7bcceca1dc30ed6948fd3e0f345d0a96883c767
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3900_70e33ce49fd5128cab27.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:70e33ce49fd5128cab27ae29a1b201e0fbf651f829b76c23575807e93e5e5fa4
+size 1180667
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3900_cb1dcab9f37512acc49e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3900_cb1dcab9f37512acc49e.png
new file mode 100644
index 0000000000000000000000000000000000000000..5e25266750a5d3c6e5f2cec9a60a1aa41eb2e798
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3900_cb1dcab9f37512acc49e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cb1dcab9f37512acc49e8af7bcf52d94fd078ea8b4e3d2378f21929a0e26e518
+size 815598
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3900_f077bf2631c7fefb2b9f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3900_f077bf2631c7fefb2b9f.png
new file mode 100644
index 0000000000000000000000000000000000000000..e10c8b8572e250b118a7406ad9f10df2348b096f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_3900_f077bf2631c7fefb2b9f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f077bf2631c7fefb2b9f6bbc0618f29f1ed4e0b29a26ada0224eb3d88ee8f623
+size 652470
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4100_254b323f919b2fe4c800.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4100_254b323f919b2fe4c800.png
new file mode 100644
index 0000000000000000000000000000000000000000..a86292fb1d34c7e3ea83f2dca042205c0fdc2207
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4100_254b323f919b2fe4c800.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:254b323f919b2fe4c80011cafb353195025bfbe23034bf8440449f1c292bf194
+size 1103319
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4100_9ee5238b4d2969ee45e9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4100_9ee5238b4d2969ee45e9.png
new file mode 100644
index 0000000000000000000000000000000000000000..72975ee8c32cc4c206db638acb62f06781938452
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4100_9ee5238b4d2969ee45e9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9ee5238b4d2969ee45e9474eed8d59c9aef323316fd458836ed5e24b2e25821c
+size 677818
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4100_be15b8af24f349147263.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4100_be15b8af24f349147263.png
new file mode 100644
index 0000000000000000000000000000000000000000..d2c549f735151e65dbcf8d503b073d5a828aa1b6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4100_be15b8af24f349147263.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:be15b8af24f3491472639b70c017aa6e1f8f97be184281d56e09768656ae6701
+size 1369617
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4100_ddf4f84a6460695dca90.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4100_ddf4f84a6460695dca90.png
new file mode 100644
index 0000000000000000000000000000000000000000..47e407a6aad0a1a949a2175a1b4a8c15c311d2d0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4100_ddf4f84a6460695dca90.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ddf4f84a6460695dca90582ea69797cd8e2d0af74b2f60c7bd606851ea0f5c3d
+size 769524
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4300_3fe66a76167d66407ad4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4300_3fe66a76167d66407ad4.png
new file mode 100644
index 0000000000000000000000000000000000000000..75361d301ff2778604a1091bb7d528bc2109a5a6
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4300_3fe66a76167d66407ad4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3fe66a76167d66407ad4e0ea989c0496d615f14f4a06686aa105bf15dd6d8c0b
+size 683719
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4300_5e7fa134a49075f9a8e0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4300_5e7fa134a49075f9a8e0.png
new file mode 100644
index 0000000000000000000000000000000000000000..ef165ddbe852529cc85db683370a5042bb232df5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4300_5e7fa134a49075f9a8e0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5e7fa134a49075f9a8e06d1cc9905344a5986bcac3460133837828eb27480452
+size 894357
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4300_67dbf9422673d51a834b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4300_67dbf9422673d51a834b.png
new file mode 100644
index 0000000000000000000000000000000000000000..3e12389ce38dee7253443a62d5d55a6e6740b845
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4300_67dbf9422673d51a834b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:67dbf9422673d51a834be64b11e05c754bba59d4cd553492f9c4efc685c4ad4c
+size 631922
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4300_bce60da0003554d8d006.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4300_bce60da0003554d8d006.png
new file mode 100644
index 0000000000000000000000000000000000000000..dd2d8d286c6332619e3eba3ac970d7415e505aa7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4300_bce60da0003554d8d006.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bce60da0003554d8d00699683b28e9262ec57498bce339dc87f75f149d59ce70
+size 643689
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4500_2099bc312a5ca66353f5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4500_2099bc312a5ca66353f5.png
new file mode 100644
index 0000000000000000000000000000000000000000..8ac5796fc686b4bc3837cb50a8dbc75ba4cae75c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4500_2099bc312a5ca66353f5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2099bc312a5ca66353f55165866fd263e0cef1c86184ccce7fe04d76a964a413
+size 665610
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4500_3558adf2133e177144bf.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4500_3558adf2133e177144bf.png
new file mode 100644
index 0000000000000000000000000000000000000000..a4ea96dfff317cf6d374090951cc1889ede593a2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4500_3558adf2133e177144bf.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3558adf2133e177144bff5f3b68e119c8aa28a642d271ed9d8c630b517b60c60
+size 1121591
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4500_9e91d87704478863e7dc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4500_9e91d87704478863e7dc.png
new file mode 100644
index 0000000000000000000000000000000000000000..ff79d4664dd8f640e4e0e78a1f065fa26e8603f7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4500_9e91d87704478863e7dc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9e91d87704478863e7dca885270c05a651dbc4a97d3a769135b85982a0117018
+size 669083
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4500_f8af08afb97bad6f894c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4500_f8af08afb97bad6f894c.png
new file mode 100644
index 0000000000000000000000000000000000000000..5c304c0930f7e2876e1242253196422c0a617c8d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4500_f8af08afb97bad6f894c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f8af08afb97bad6f894cdec4ac4cf0b89e53a541af11b3a8a99dd359f48e58db
+size 431749
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4700_429fd7fc23f3c008bf50.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4700_429fd7fc23f3c008bf50.png
new file mode 100644
index 0000000000000000000000000000000000000000..5e5e54847ad8ca4ca5eb87f5a8063312103ba4f3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4700_429fd7fc23f3c008bf50.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:429fd7fc23f3c008bf502a8ab51038f6fab01f9706578a0ee82a0176bb81d03e
+size 545340
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4700_99b989297e32289b7a53.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4700_99b989297e32289b7a53.png
new file mode 100644
index 0000000000000000000000000000000000000000..7d8af1cecb3df79f1ecb329b9956df522375216e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4700_99b989297e32289b7a53.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:99b989297e32289b7a5327fb7ded20e49c2036bd79aa300d7dc0341fcba467c1
+size 834152
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4700_9f93122bf39a6ceb64bd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4700_9f93122bf39a6ceb64bd.png
new file mode 100644
index 0000000000000000000000000000000000000000..d29198097ece7f3f018641873f253a143b536e6f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4700_9f93122bf39a6ceb64bd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9f93122bf39a6ceb64bd0d0e2fa9678124f820e412ff081d3cb6836e581334c6
+size 762205
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4700_d35eedb5643705f01394.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4700_d35eedb5643705f01394.png
new file mode 100644
index 0000000000000000000000000000000000000000..ceefdcc27a748049cc0951695100f84b83e76b34
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4700_d35eedb5643705f01394.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d35eedb5643705f01394d0bc658709591b9f06079550d308cf12fa62cb3008c7
+size 573129
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4900_43f157382c3d92ebd1a1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4900_43f157382c3d92ebd1a1.png
new file mode 100644
index 0000000000000000000000000000000000000000..cdff42384438584a279ca21acb7db85f7643e3da
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4900_43f157382c3d92ebd1a1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:43f157382c3d92ebd1a130c2cfa2795839b3c1b2ee34e3dd729121ace2a4f830
+size 404396
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4900_4cdeaddfb8aa662c8e34.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4900_4cdeaddfb8aa662c8e34.png
new file mode 100644
index 0000000000000000000000000000000000000000..9e37cd62f145474a5b16a969a21c8ed8485befbd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4900_4cdeaddfb8aa662c8e34.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4cdeaddfb8aa662c8e347dd2d87deb803965150b42e36ea0828a66f25b3554fa
+size 889432
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4900_c6fe544fd3c8d8c3a647.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4900_c6fe544fd3c8d8c3a647.png
new file mode 100644
index 0000000000000000000000000000000000000000..c3345cc107c10af5246c66e17a0b7bbfb153dca4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4900_c6fe544fd3c8d8c3a647.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c6fe544fd3c8d8c3a6478e4a4ae649bf6f19647a9dfda773c409d83923b65b17
+size 877428
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4900_ff795ecb2e94ba5dbc53.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4900_ff795ecb2e94ba5dbc53.png
new file mode 100644
index 0000000000000000000000000000000000000000..235a05356e7452e0a241c767df8403ca9fe85d8c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_4900_ff795ecb2e94ba5dbc53.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ff795ecb2e94ba5dbc5338e2493d8bdc24091dddcbed5dfd3e086dc545d2d24e
+size 418562
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_500_16f4d5636653876415bb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_500_16f4d5636653876415bb.png
new file mode 100644
index 0000000000000000000000000000000000000000..de007a55c99928003aa9f417198b3630aff8fdb0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_500_16f4d5636653876415bb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:16f4d5636653876415bb3dfcb5b0ccab6e7b14858fa0fcc2bb0f6276afcd35d1
+size 1062542
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_500_456058aaadffe0ff58da.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_500_456058aaadffe0ff58da.png
new file mode 100644
index 0000000000000000000000000000000000000000..245a6004020d3ed8eb28b0665318b89815a3a536
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_500_456058aaadffe0ff58da.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:456058aaadffe0ff58da98992e8784246b5e44fdf1f87d56a6fdac66ed2e6b2c
+size 1026252
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_500_ab59d130faadcf7403b4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_500_ab59d130faadcf7403b4.png
new file mode 100644
index 0000000000000000000000000000000000000000..a24301e3726c61ea2d21927df69149f3a0977c2e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_500_ab59d130faadcf7403b4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ab59d130faadcf7403b4448b18631d192013ada7084e72efe82e37a44f63fffb
+size 1059602
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_500_eea67e654393bec159b8.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_500_eea67e654393bec159b8.png
new file mode 100644
index 0000000000000000000000000000000000000000..2f6d84cc0583e1bc94b30af18953ffd7078ef4c8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_500_eea67e654393bec159b8.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:eea67e654393bec159b8040814e150f6ab27056b565a06eaf304c6bc9c1694f5
+size 1025715
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5100_13f773cb7769a700eab5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5100_13f773cb7769a700eab5.png
new file mode 100644
index 0000000000000000000000000000000000000000..aa843c9e45aeb6a598577499467769488f6a0de7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5100_13f773cb7769a700eab5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:13f773cb7769a700eab51731ca9323fab1aa3be05b59e22f9eaf8381192bf56e
+size 597194
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5100_30ec53b33930d9b32982.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5100_30ec53b33930d9b32982.png
new file mode 100644
index 0000000000000000000000000000000000000000..ca93cbd30901efc09a0c05b155c796652457b44b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5100_30ec53b33930d9b32982.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:30ec53b33930d9b3298249497d70143ac6032d1289518831b2a81640bb224cff
+size 955193
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5100_c876b4b80693476e929e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5100_c876b4b80693476e929e.png
new file mode 100644
index 0000000000000000000000000000000000000000..e6df3fe1e4830b78a5cc3d4eb2fb5f057b928f4b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5100_c876b4b80693476e929e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c876b4b80693476e929e96bdf1f5d88353aa9280c0ae8eb52bd8fc1a020f27db
+size 650278
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5100_df4ddd65910de5da371c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5100_df4ddd65910de5da371c.png
new file mode 100644
index 0000000000000000000000000000000000000000..8c2cf45569a02868ad2d39054ef0eaba4aed9975
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5100_df4ddd65910de5da371c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:df4ddd65910de5da371c8a625611d80fa1a3e07b246e929147a31f676d7e03b0
+size 1457728
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5300_71c29611661df2951f9f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5300_71c29611661df2951f9f.png
new file mode 100644
index 0000000000000000000000000000000000000000..21a8ed85454ee0b21f42870cfb69ab60ffb0b80b
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5300_71c29611661df2951f9f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:71c29611661df2951f9f267e37ea941f600448417c04c2e5be362910d7b842e1
+size 516837
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5300_9a988f1e5e45c58df5da.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5300_9a988f1e5e45c58df5da.png
new file mode 100644
index 0000000000000000000000000000000000000000..74a87556d00c529c5e06580679e3dadf66770c70
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5300_9a988f1e5e45c58df5da.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9a988f1e5e45c58df5da49fd27b3e2da90e463f02119ee18a0085e020da50dee
+size 841635
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5300_b64d4ae9873eafaf957e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5300_b64d4ae9873eafaf957e.png
new file mode 100644
index 0000000000000000000000000000000000000000..584ee98fbd0b6243ca7b805a5a0963bdd6a4f556
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5300_b64d4ae9873eafaf957e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b64d4ae9873eafaf957e80fd23ae9f07ea37afa9eb616183205a71b2d5a39470
+size 516775
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5300_d950a5111e1ece1f2eb4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5300_d950a5111e1ece1f2eb4.png
new file mode 100644
index 0000000000000000000000000000000000000000..79504631afe706f9c9a523b651f13dc6ff49ca5c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5300_d950a5111e1ece1f2eb4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d950a5111e1ece1f2eb407f799ae00d465a6e42c97c6c27f8711104cf12e812b
+size 824327
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5500_3200d674c2eef67a647b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5500_3200d674c2eef67a647b.png
new file mode 100644
index 0000000000000000000000000000000000000000..5760b1b8fc0e6052149cda2294acc7a3655d4747
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5500_3200d674c2eef67a647b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3200d674c2eef67a647bdf3e09f1849c5d5ac2cf71831a6cf72dcbeda12a3bff
+size 995073
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5500_7e99affd9afd34ae44b2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5500_7e99affd9afd34ae44b2.png
new file mode 100644
index 0000000000000000000000000000000000000000..eb8cefde430a1f8b95f3f00b91bab60cc3ee8b27
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5500_7e99affd9afd34ae44b2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7e99affd9afd34ae44b233087c98aca3481cb2daadbaffc9235c534d6e560e07
+size 660368
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5500_b5b7eada0bd7159b6df5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5500_b5b7eada0bd7159b6df5.png
new file mode 100644
index 0000000000000000000000000000000000000000..a283b80eedbbf307b4ede7fb159cd955ed44ffb3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5500_b5b7eada0bd7159b6df5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b5b7eada0bd7159b6df5fc2ca5c71fdbec3760f836f646610f3da97bf32dab3a
+size 573219
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5500_e7c53df4e150300345ed.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5500_e7c53df4e150300345ed.png
new file mode 100644
index 0000000000000000000000000000000000000000..a1ce703b7bdb7f2c00289a300886027f7c9dbfe3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5500_e7c53df4e150300345ed.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e7c53df4e150300345ed1e77c4f96492afc33610c8195cffa98fb0b6c870a11f
+size 881757
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5700_6eb0d0e95ac285b30859.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5700_6eb0d0e95ac285b30859.png
new file mode 100644
index 0000000000000000000000000000000000000000..4e6cb88c7c50f7451e73cba700c2983584668e14
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5700_6eb0d0e95ac285b30859.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6eb0d0e95ac285b308597680e05948491591fd497b4c2a7cdd6976d4a7a9b673
+size 575634
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5700_8b6d96d4d9e7f5d61dc0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5700_8b6d96d4d9e7f5d61dc0.png
new file mode 100644
index 0000000000000000000000000000000000000000..bad30a428e881d1e7ed178154c4a95f8b237300d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5700_8b6d96d4d9e7f5d61dc0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8b6d96d4d9e7f5d61dc022e7d6da68610d2e30c0cf2bdd5b68305334566a640a
+size 1112750
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5700_98c3e5e8bfa3a35d3dbd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5700_98c3e5e8bfa3a35d3dbd.png
new file mode 100644
index 0000000000000000000000000000000000000000..66493c7de71b92e6da30e1eabdd7b5b2922a1af4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5700_98c3e5e8bfa3a35d3dbd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:98c3e5e8bfa3a35d3dbd331ab6a4be8624ba87e65a15db73c362a6167de3f6e6
+size 597030
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5700_c30e8343923b91a5e06a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5700_c30e8343923b91a5e06a.png
new file mode 100644
index 0000000000000000000000000000000000000000..23119720f54e3376433a2c4712c7ea7aa03d9248
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5700_c30e8343923b91a5e06a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c30e8343923b91a5e06ab5a89169749d35a6d72e6a62a94fe21340f3fb4aef16
+size 1030706
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5900_8078623ed731cff4be72.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5900_8078623ed731cff4be72.png
new file mode 100644
index 0000000000000000000000000000000000000000..26b61c2c38cc698cf3f7f277d27b7683b5f68cde
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5900_8078623ed731cff4be72.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5900_9fc75ae87c6b8206684f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5900_9fc75ae87c6b8206684f.png
new file mode 100644
index 0000000000000000000000000000000000000000..d39ccff43fd6dbb633a31b0edb14f11c74750842
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5900_9fc75ae87c6b8206684f.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5900_a7a2068e3865d655aa08.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5900_a7a2068e3865d655aa08.png
new file mode 100644
index 0000000000000000000000000000000000000000..d8f9eeaa14882613a6c977dede052fe2dff5661d
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5900_a7a2068e3865d655aa08.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5900_aa7df6d7c3d66e1824bd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5900_aa7df6d7c3d66e1824bd.png
new file mode 100644
index 0000000000000000000000000000000000000000..45bc28d02b1a466f4c9f5cbca7758dfd39278561
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_5900_aa7df6d7c3d66e1824bd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aa7df6d7c3d66e1824bd4f6152dcaf708cf4e38ee8a8da2518ae39e5d61f3a60
+size 843914
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6100_55a27a44284de1bdf443.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6100_55a27a44284de1bdf443.png
new file mode 100644
index 0000000000000000000000000000000000000000..abc485a1184cfe9b6bfc735e041f6eb098f5cefc
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6100_55a27a44284de1bdf443.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:55a27a44284de1bdf443ac0953cba7b47d12f4c18d22950b1f0aaf8ac10c27f5
+size 769882
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6100_5ace67b39a4a62859974.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6100_5ace67b39a4a62859974.png
new file mode 100644
index 0000000000000000000000000000000000000000..7b751a9d2102e073d395cff7c98b1f66defa671a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6100_5ace67b39a4a62859974.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5ace67b39a4a62859974e379b98ddd46f9d916dd631659c440dacb0f22aed292
+size 985987
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6100_b54d379780eda4055799.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6100_b54d379780eda4055799.png
new file mode 100644
index 0000000000000000000000000000000000000000..c19d20f889cc5c0c6c43da50850709a8bc940944
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6100_b54d379780eda4055799.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b54d379780eda4055799e49011fbbe7f02c32b647f08490e79fd478f62a5f759
+size 559933
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6100_e8e59925bbf72b4c06ac.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6100_e8e59925bbf72b4c06ac.png
new file mode 100644
index 0000000000000000000000000000000000000000..4b63672a204e849be7c0ad47aa52ff3ddacf2373
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6100_e8e59925bbf72b4c06ac.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e8e59925bbf72b4c06ac7d4067ed22950c0178321d2d2b321dd27945065ae866
+size 965578
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6300_0770361260811ca2a2a6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6300_0770361260811ca2a2a6.png
new file mode 100644
index 0000000000000000000000000000000000000000..baba8118d3adfc764d41f5ae415328fab68dec54
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6300_0770361260811ca2a2a6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0770361260811ca2a2a6fbe0a03d2a90b28990a891e439e4f269e2153197473e
+size 708320
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6300_2d165f2c4570520a67ca.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6300_2d165f2c4570520a67ca.png
new file mode 100644
index 0000000000000000000000000000000000000000..6f98cd71c66373b1cf62790347d174259ef68183
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6300_2d165f2c4570520a67ca.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d165f2c4570520a67ca12699fc770efffa2a396d4f6c2271d493aaf03738be5
+size 505391
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6300_8225213c63c3a0951f57.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6300_8225213c63c3a0951f57.png
new file mode 100644
index 0000000000000000000000000000000000000000..00a4912df7cff17c10b21b0c9606307a8c9c2ecf
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6300_8225213c63c3a0951f57.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8225213c63c3a0951f57cbf9ed52964f8183a4ead6a07611dffe3cb22c6f2891
+size 900181
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6300_b33520d306f33c679bcc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6300_b33520d306f33c679bcc.png
new file mode 100644
index 0000000000000000000000000000000000000000..25c458f65c00ed9dc56da0dfb58dd7d2530abac7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6300_b33520d306f33c679bcc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b33520d306f33c679bccf4b784a56113fa0dd2077b91be72237a37412615cb12
+size 878097
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6500_15528fc7c9be075ea8ec.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6500_15528fc7c9be075ea8ec.png
new file mode 100644
index 0000000000000000000000000000000000000000..90c0fd3f8434bdcefc49202a2e386aae1501762f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6500_15528fc7c9be075ea8ec.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:15528fc7c9be075ea8ec1e500f9347e2a58e38234eafa99d6f5eb5e74abdac91
+size 940166
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6500_41d6672679ae07863b2d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6500_41d6672679ae07863b2d.png
new file mode 100644
index 0000000000000000000000000000000000000000..755bf6a4a2b3ab09a3525c6cc892a4a297012128
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6500_41d6672679ae07863b2d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:41d6672679ae07863b2df647c9b08835382fceee48bfe6eea55d58906edeb346
+size 826071
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6500_50d54e28f4dd684f6230.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6500_50d54e28f4dd684f6230.png
new file mode 100644
index 0000000000000000000000000000000000000000..84931e5cb290af3f509891d8632a8c6947c3c4f3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6500_50d54e28f4dd684f6230.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:50d54e28f4dd684f62302e88aee7d97ba6ef7174918cea1ca47aa5e0d58b2916
+size 393283
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6500_7e263c3b230926ff0c67.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6500_7e263c3b230926ff0c67.png
new file mode 100644
index 0000000000000000000000000000000000000000..52f1f566b6f16cb800229566ea75bde8e5b9c3ac
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6500_7e263c3b230926ff0c67.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7e263c3b230926ff0c67022df36957e8ccbeb716a37ad625fc1cd4a26bbc4a64
+size 312572
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6700_496f4a1a5ead2232673c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6700_496f4a1a5ead2232673c.png
new file mode 100644
index 0000000000000000000000000000000000000000..7651104b57b427b562da519b0fc8e1e0620c2eed
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6700_496f4a1a5ead2232673c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:496f4a1a5ead2232673c179cb72e71a6e1fb81bc932ce442e76a2b61788364a4
+size 517078
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6700_5d87d8510abb732c8248.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6700_5d87d8510abb732c8248.png
new file mode 100644
index 0000000000000000000000000000000000000000..640b5ffc368961d38288c6736bebbabb32187189
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6700_5d87d8510abb732c8248.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5d87d8510abb732c82482d8644f991a8338c6e652ecac62eb3d9d795e567d173
+size 772106
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6700_c5db3a63f358fd3a3181.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6700_c5db3a63f358fd3a3181.png
new file mode 100644
index 0000000000000000000000000000000000000000..4040abe147a23efc7b4da652bb420b2503a206c7
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6700_c5db3a63f358fd3a3181.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c5db3a63f358fd3a31817483f03fc67422ac2b79fc15a72b1850be1038ed7fd5
+size 448409
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6700_cb3341922a8600becb18.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6700_cb3341922a8600becb18.png
new file mode 100644
index 0000000000000000000000000000000000000000..8e83ee580a2745055b572a108ab8b998e15e4526
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6700_cb3341922a8600becb18.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cb3341922a8600becb18ed7b6675654b223be135caf7c3106b801d26364ed4b4
+size 323921
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6900_599caef5e76a6f86ebf4.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6900_599caef5e76a6f86ebf4.png
new file mode 100644
index 0000000000000000000000000000000000000000..7ff6bcfb0b348e4864846edea6e9d48b70be2362
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6900_599caef5e76a6f86ebf4.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:599caef5e76a6f86ebf45956f92d3ba39660048a78d7dfd985476853e772a6a7
+size 717460
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6900_6b1b54b0707ef0f67fe2.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6900_6b1b54b0707ef0f67fe2.png
new file mode 100644
index 0000000000000000000000000000000000000000..0167da9b48e32925032c635470333792472b4313
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6900_6b1b54b0707ef0f67fe2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6b1b54b0707ef0f67fe2f5aa17863ae8f9e18c9682812d5fce8ab6943e7dea42
+size 1009182
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6900_be59249ffd44076e74cd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6900_be59249ffd44076e74cd.png
new file mode 100644
index 0000000000000000000000000000000000000000..59e95d6b4ecc735524fb27fca3dd3836ba8b6d84
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6900_be59249ffd44076e74cd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:be59249ffd44076e74cd944443e6504a754e7c91e0374afa8581de4e76812700
+size 505639
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6900_e69a026bd7622dfa425e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6900_e69a026bd7622dfa425e.png
new file mode 100644
index 0000000000000000000000000000000000000000..8efe965d48e5c6fd2721965682f920b6c2070a8e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_6900_e69a026bd7622dfa425e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e69a026bd7622dfa425e501f845f431fbd14577605f59b16f8746cdfb1ec591b
+size 555408
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_700_3d2d8138d1ef447467c5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_700_3d2d8138d1ef447467c5.png
new file mode 100644
index 0000000000000000000000000000000000000000..f70cc4688ed184da6700fcb4c04b732bc303dda3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_700_3d2d8138d1ef447467c5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3d2d8138d1ef447467c5bc6be559902c8a14885782cf6153b36e807847dede93
+size 593756
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_700_9e01af0e7fe6c96a1cff.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_700_9e01af0e7fe6c96a1cff.png
new file mode 100644
index 0000000000000000000000000000000000000000..aeff9b11c7ba46e0ddf588a03f9f82fcb83b6cee
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_700_9e01af0e7fe6c96a1cff.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9e01af0e7fe6c96a1cff2e7b741fdae98904e6b7523108afe9a4115759f06452
+size 608506
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_700_aef2be6b10de57ad3188.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_700_aef2be6b10de57ad3188.png
new file mode 100644
index 0000000000000000000000000000000000000000..037f7794c24bea3f47ad87e8d848794766750905
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_700_aef2be6b10de57ad3188.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aef2be6b10de57ad3188405814941fbb32ee62b355396e95935ca4a5436a3af7
+size 754364
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_700_d2d17fb415a7c62bb179.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_700_d2d17fb415a7c62bb179.png
new file mode 100644
index 0000000000000000000000000000000000000000..2da9589d3ee68b0341c55cb6b52c6833c0d29fc0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_700_d2d17fb415a7c62bb179.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d2d17fb415a7c62bb1798585e28fba21786ff3708bb13296dc09ddf9503a6614
+size 603906
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7100_51c33c8c04876f495a03.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7100_51c33c8c04876f495a03.png
new file mode 100644
index 0000000000000000000000000000000000000000..de93699cc128d88034bc980042e60f90cbde11b0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7100_51c33c8c04876f495a03.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:51c33c8c04876f495a03e33e39cb0de12cc5acfebaa1b3f9d7c7defec381e9e2
+size 354298
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7100_521f7ac2cd1e1066f5b3.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7100_521f7ac2cd1e1066f5b3.png
new file mode 100644
index 0000000000000000000000000000000000000000..a06b3e16be3ee23cee2b8bdbd13ea269d18dccac
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7100_521f7ac2cd1e1066f5b3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:521f7ac2cd1e1066f5b32be301e8e42823de7c11ff9321ab64ce8b6a98f31190
+size 322159
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7100_75ab92ecc29743d47eca.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7100_75ab92ecc29743d47eca.png
new file mode 100644
index 0000000000000000000000000000000000000000..1eb2ab7d08ea0e73447b76a209516e3259d4b5d8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7100_75ab92ecc29743d47eca.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:75ab92ecc29743d47eca8d839ad7926eecae2299030523a4fcc0f35af5421db9
+size 848812
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7100_8499e1ae07479b0649be.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7100_8499e1ae07479b0649be.png
new file mode 100644
index 0000000000000000000000000000000000000000..0f67d9ab4b22eb1777422f8201b1f125c43a24c2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7100_8499e1ae07479b0649be.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8499e1ae07479b0649bed6eac6d8a877086ae74ddbd6414c91f7a1685200cb5d
+size 614235
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7300_5142a7ad9c724e2f4bbd.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7300_5142a7ad9c724e2f4bbd.png
new file mode 100644
index 0000000000000000000000000000000000000000..632842aed61b8c2a99ecf4c95df6899edb8465f9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7300_5142a7ad9c724e2f4bbd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5142a7ad9c724e2f4bbd59f1cc08bf79dae6c67c424d68adf91017f0f89d0a75
+size 493972
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7300_8b9b479f066838a4c0e1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7300_8b9b479f066838a4c0e1.png
new file mode 100644
index 0000000000000000000000000000000000000000..85698d480a487e7b168f72841d5f95a4aecace2f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7300_8b9b479f066838a4c0e1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8b9b479f066838a4c0e1e9d8f88be4ad93402f1300f33ae3bc47fd541f6a94e8
+size 1447392
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7300_d3de116ea9a2f8fbc13d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7300_d3de116ea9a2f8fbc13d.png
new file mode 100644
index 0000000000000000000000000000000000000000..d806125ef446c146f2740c96d49fa37ee1d7d56c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7300_d3de116ea9a2f8fbc13d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d3de116ea9a2f8fbc13dd3a1771dce707620561e827d54b93801199c328b73a1
+size 477634
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7300_ef1043dc4c76d9824ea1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7300_ef1043dc4c76d9824ea1.png
new file mode 100644
index 0000000000000000000000000000000000000000..55ea2aab8e45ccc701548c3a5d8f8cf50295ea26
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7300_ef1043dc4c76d9824ea1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ef1043dc4c76d9824ea14449a15ead48aa0798c91afe2c6374fab8c01de44ad4
+size 437369
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7500_124cc3a5ada679d21e39.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7500_124cc3a5ada679d21e39.png
new file mode 100644
index 0000000000000000000000000000000000000000..5c5eaa5585ef33fdabbd3eaa694319749f02948e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7500_124cc3a5ada679d21e39.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:124cc3a5ada679d21e39a69373cfebcd68d0d9f022f5f28734ab262f52b10901
+size 688911
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7500_549a1c891f09b0ffd08d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7500_549a1c891f09b0ffd08d.png
new file mode 100644
index 0000000000000000000000000000000000000000..a0604f22a30e6351f5814bd84fd2039f0b209763
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7500_549a1c891f09b0ffd08d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:549a1c891f09b0ffd08d6687563ae477150224854ec96c6c0ae99c13788bbd32
+size 744501
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7500_8e3a3f7f531eff602c5b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7500_8e3a3f7f531eff602c5b.png
new file mode 100644
index 0000000000000000000000000000000000000000..b504a98f4c1024025fbb63511adfdc70a2cdde52
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7500_8e3a3f7f531eff602c5b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8e3a3f7f531eff602c5b008471c84676f1ea6b8c88e30a5b1e1bd8dd0ccf6bc1
+size 943782
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7500_b0c03b8499bbfb03c9cc.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7500_b0c03b8499bbfb03c9cc.png
new file mode 100644
index 0000000000000000000000000000000000000000..414ca64d44fdc42eb524f8d1f80f4639b21fc6e0
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7500_b0c03b8499bbfb03c9cc.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b0c03b8499bbfb03c9cc074844ffd143bd6fc629abb0140e59004fa95ffb92ad
+size 897074
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7700_58d9d7880a91c5be2fad.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7700_58d9d7880a91c5be2fad.png
new file mode 100644
index 0000000000000000000000000000000000000000..27afa421c9880efd79b155db609a2373a7516be9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7700_58d9d7880a91c5be2fad.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:58d9d7880a91c5be2faddea4c8390b8ce16e920ba5098d11330c9bba9135c5a5
+size 624561
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7700_73351f21dd80d4d2a669.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7700_73351f21dd80d4d2a669.png
new file mode 100644
index 0000000000000000000000000000000000000000..3b2293211ed9b6e8f7047b32a229751aa54e9552
Binary files /dev/null and b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7700_73351f21dd80d4d2a669.png differ
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7700_9481c6e06d0d81d25a8c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7700_9481c6e06d0d81d25a8c.png
new file mode 100644
index 0000000000000000000000000000000000000000..f72850e0c3972147c04ce9478f6894d481d85db2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7700_9481c6e06d0d81d25a8c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9481c6e06d0d81d25a8cebd23a6b92127241c7d4e9d314875f4d5063d53126bc
+size 1209508
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7700_e86f1fafa21bf5dd2772.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7700_e86f1fafa21bf5dd2772.png
new file mode 100644
index 0000000000000000000000000000000000000000..00b7142ce4d74043497e41ba0c366a6fc4d6d173
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7700_e86f1fafa21bf5dd2772.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e86f1fafa21bf5dd2772f218555716f5d17be7f29028f4e381ce833d8a5b5016
+size 675552
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7900_017547460d75fe74fd71.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7900_017547460d75fe74fd71.png
new file mode 100644
index 0000000000000000000000000000000000000000..fcb43caa4dbd39d4d82c90c9bba8ac981ef9cc96
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7900_017547460d75fe74fd71.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:017547460d75fe74fd71eb51892fe69637cafb817516a48afe351743ea42c647
+size 226380
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7900_43e1556f609ccb4d2b26.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7900_43e1556f609ccb4d2b26.png
new file mode 100644
index 0000000000000000000000000000000000000000..a98df831e267a78c240218e57c03de4e2e01155f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7900_43e1556f609ccb4d2b26.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:43e1556f609ccb4d2b267f27f6805f1033c7afe73a66b90367336f0db1d12f87
+size 184159
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7900_67dafceb055957590ff5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7900_67dafceb055957590ff5.png
new file mode 100644
index 0000000000000000000000000000000000000000..31394834a03093dbf9519b7819ffb8b84aff73fd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7900_67dafceb055957590ff5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:67dafceb055957590ff532c27c300ae15488b8eb484bb33d6210b8bf679ba388
+size 429619
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7900_7283a4ca4791d0a09ef7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7900_7283a4ca4791d0a09ef7.png
new file mode 100644
index 0000000000000000000000000000000000000000..520aaf2e7ee0f0049ad800598713df84e1f2f71f
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_7900_7283a4ca4791d0a09ef7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7283a4ca4791d0a09ef7b72d8982763282d9583436c173deebe536a6834a68a2
+size 132114
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8100_6b1fcbbb8baad4ee6a06.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8100_6b1fcbbb8baad4ee6a06.png
new file mode 100644
index 0000000000000000000000000000000000000000..aecc3deb7627c25e4b0957e18be5ba6148e65c51
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8100_6b1fcbbb8baad4ee6a06.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6b1fcbbb8baad4ee6a06be300cd88568179d42a6e890ab46ed47def27b8eadf8
+size 986416
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8100_8912a1f4e1fe4cc0aafa.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8100_8912a1f4e1fe4cc0aafa.png
new file mode 100644
index 0000000000000000000000000000000000000000..249978a75f2bc85d77651cee7fee4615cc25e37e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8100_8912a1f4e1fe4cc0aafa.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8912a1f4e1fe4cc0aafafa1517ce56029dd66ca4b318d4926a29a19d053ceba8
+size 552655
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8100_bbb397f5e8a04ff76793.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8100_bbb397f5e8a04ff76793.png
new file mode 100644
index 0000000000000000000000000000000000000000..ea69a7e86e754d5563e42c54b97824332c019ecd
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8100_bbb397f5e8a04ff76793.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bbb397f5e8a04ff76793d1eb623f34e8e26780badf736719f55195f44399c981
+size 527661
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8100_e37ae2773066fa8b181d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8100_e37ae2773066fa8b181d.png
new file mode 100644
index 0000000000000000000000000000000000000000..489297814326af2128ba710bf6f06449e096d8b9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8100_e37ae2773066fa8b181d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e37ae2773066fa8b181d93d5a3930a27d267e811cc6644973c5aba7ce17fd82f
+size 756822
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8300_05aac2448ba0b620db82.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8300_05aac2448ba0b620db82.png
new file mode 100644
index 0000000000000000000000000000000000000000..107a684eb18ba2b944af04814e4b33dd29b7141a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8300_05aac2448ba0b620db82.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:05aac2448ba0b620db82cbd5851c97354ccfa2f6c9756b2059e0e9d81a484c66
+size 1068516
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8300_69e9d9cae22b0933c00a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8300_69e9d9cae22b0933c00a.png
new file mode 100644
index 0000000000000000000000000000000000000000..e7525bf2ae35973601250e2362044bb936829bb8
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8300_69e9d9cae22b0933c00a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:69e9d9cae22b0933c00a4a243436fd1623161fb0def8d6ffa6ba16d875715f43
+size 601418
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8300_9a0679b8d25a2f3544fe.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8300_9a0679b8d25a2f3544fe.png
new file mode 100644
index 0000000000000000000000000000000000000000..71b136c3e41339278fdf1703ab9650eb2fcc2f21
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8300_9a0679b8d25a2f3544fe.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9a0679b8d25a2f3544fed4603ac6c19a0135c98ba5076f45fa896258507e7e80
+size 418336
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8300_bf8368573bfa25cba069.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8300_bf8368573bfa25cba069.png
new file mode 100644
index 0000000000000000000000000000000000000000..b8e5df82a7ef5e61be1b19334046b7e896acbe76
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8300_bf8368573bfa25cba069.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bf8368573bfa25cba0692b5c6b36492cb80c367d4ea28d41d6b43b4541415ad9
+size 276723
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8500_22c9b204ce66582f1a1c.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8500_22c9b204ce66582f1a1c.png
new file mode 100644
index 0000000000000000000000000000000000000000..aa484dd7a9b8b7af88a5cbec2fb524a3caa10c25
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8500_22c9b204ce66582f1a1c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:22c9b204ce66582f1a1c1cec2a554ad404a830cff923800f61528c6c56cc4ee7
+size 167308
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8500_3d84301018b1d13ef086.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8500_3d84301018b1d13ef086.png
new file mode 100644
index 0000000000000000000000000000000000000000..b320d7fb91ee26fe3ff1e97733c97c3cebf83cf1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8500_3d84301018b1d13ef086.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3d84301018b1d13ef0862ee4a3772719cf4feeead9736b7bca353eb8678a2970
+size 737713
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8500_607c0107a02871d72142.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8500_607c0107a02871d72142.png
new file mode 100644
index 0000000000000000000000000000000000000000..69b010033bacd6a238152fd28f24b0b77423b306
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8500_607c0107a02871d72142.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:607c0107a02871d72142d3707d6997ec42db21179915f269bfb91eab26828416
+size 1321007
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8500_ef3634cf1158db63c8f7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8500_ef3634cf1158db63c8f7.png
new file mode 100644
index 0000000000000000000000000000000000000000..52c922783e9caf3e38b409825aeac4e052b36f98
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8500_ef3634cf1158db63c8f7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ef3634cf1158db63c8f715703eab08f0a490a34cf5b0d9acf961e80bf778843c
+size 746134
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8700_001da77b021517330246.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8700_001da77b021517330246.png
new file mode 100644
index 0000000000000000000000000000000000000000..f9d77e3bb6a60b056425ff7ba717eafb2b8ea28c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8700_001da77b021517330246.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:001da77b0215173302468a30fafd9447d6496768054b795d192cc10b4b396afb
+size 619953
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8700_7ffe115d2ab11efe0bf6.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8700_7ffe115d2ab11efe0bf6.png
new file mode 100644
index 0000000000000000000000000000000000000000..3881522937fa98c718798d8a4ae43085591be17c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8700_7ffe115d2ab11efe0bf6.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7ffe115d2ab11efe0bf66921c1926c982a2cb1764ebda83dfe767f999f9c66dd
+size 411849
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8700_c8abf3476da0570d4a2a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8700_c8abf3476da0570d4a2a.png
new file mode 100644
index 0000000000000000000000000000000000000000..d262236f4938640c035ed00b03fb0bf0d801ed80
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8700_c8abf3476da0570d4a2a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c8abf3476da0570d4a2a8bdf85299d562b3fffa1156ad0e90ef19d3fe3ae1aa5
+size 696590
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8700_f5e8e7b15c11775c3692.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8700_f5e8e7b15c11775c3692.png
new file mode 100644
index 0000000000000000000000000000000000000000..06ff39de7fa5b517b5f9c22af8130e717a64ed64
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8700_f5e8e7b15c11775c3692.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f5e8e7b15c11775c36922df2bec76b331fe64631ff6cee4232dcc30cac5fb986
+size 751027
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8900_295bf98a3d51b0f28035.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8900_295bf98a3d51b0f28035.png
new file mode 100644
index 0000000000000000000000000000000000000000..9a6e3f298ddf0f840751493183135dad6d013f5e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8900_295bf98a3d51b0f28035.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:295bf98a3d51b0f28035364d0c117ca4668c37d76f4fbd27753568e3bb2fe0b3
+size 693314
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8900_7d0644cdfd1acddad66a.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8900_7d0644cdfd1acddad66a.png
new file mode 100644
index 0000000000000000000000000000000000000000..ac2fca40d2c474158cea54c5b01d269f1e75834a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8900_7d0644cdfd1acddad66a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7d0644cdfd1acddad66afedf0360a03d0e06a576aaeb351e4fd8d7f5c35e90a2
+size 531268
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8900_d2e0b972bdbf5df9cbeb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8900_d2e0b972bdbf5df9cbeb.png
new file mode 100644
index 0000000000000000000000000000000000000000..561169fcc958c3f3775c463092db6da8b43496b3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8900_d2e0b972bdbf5df9cbeb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d2e0b972bdbf5df9cbeb73f27d9c1b37723f53e93ed988052d1295d27bea35fd
+size 663557
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8900_d9fc0bdf166b6651c770.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8900_d9fc0bdf166b6651c770.png
new file mode 100644
index 0000000000000000000000000000000000000000..13335a8ab93793fb651df48b083a6ea4134c4446
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_8900_d9fc0bdf166b6651c770.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d9fc0bdf166b6651c770449ac291eb2800686a87e1b7c68687615c46b1d5929c
+size 904950
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_900_1fd86ef5439a1041a030.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_900_1fd86ef5439a1041a030.png
new file mode 100644
index 0000000000000000000000000000000000000000..fbd21adc65b61514bb8a7143fc836ed7988a78b9
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_900_1fd86ef5439a1041a030.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1fd86ef5439a1041a030ad312aa18f6eb29058f582414a06934eef26e85fe97f
+size 177488
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_900_32b5f05e33187a99fdac.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_900_32b5f05e33187a99fdac.png
new file mode 100644
index 0000000000000000000000000000000000000000..ed7c338fe4b0768252e977451f8f417f98874e53
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_900_32b5f05e33187a99fdac.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:32b5f05e33187a99fdac5cead0b61eadc94a215e2bb5a4d6920c49969e283558
+size 175097
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_900_8a3dd53b8ae01e526eb7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_900_8a3dd53b8ae01e526eb7.png
new file mode 100644
index 0000000000000000000000000000000000000000..f1635ad7d26e8ca6379f1872b33b9bd560c8422c
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_900_8a3dd53b8ae01e526eb7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8a3dd53b8ae01e526eb7128ba35c9c9c98134bba49650c315015bd5fd2231225
+size 148327
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_900_ebc37f56cd409382a664.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_900_ebc37f56cd409382a664.png
new file mode 100644
index 0000000000000000000000000000000000000000..9effeeb349ced613931188fc6a6bacdcc5bfcdee
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_900_ebc37f56cd409382a664.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ebc37f56cd409382a6642ef75c1910c4b925f9f3d2882920f1501e400c08c565
+size 352154
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9100_3511c54296fbbbc9ebf0.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9100_3511c54296fbbbc9ebf0.png
new file mode 100644
index 0000000000000000000000000000000000000000..60786e38f22c6661a89446b3920fe7df6449e3b2
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9100_3511c54296fbbbc9ebf0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3511c54296fbbbc9ebf08a56f441e0be51bdf3bec4f56be66409035f9d9249ab
+size 1143951
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9100_67ff720ea7b5c04bb1bb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9100_67ff720ea7b5c04bb1bb.png
new file mode 100644
index 0000000000000000000000000000000000000000..0b8668e563cea41869f19a9ccd3be271028fc5d4
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9100_67ff720ea7b5c04bb1bb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:67ff720ea7b5c04bb1bb7d06562a4ea3b720d9d1a043efe91ae83db197826842
+size 836616
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9100_8d672a0c9ef946761f5b.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9100_8d672a0c9ef946761f5b.png
new file mode 100644
index 0000000000000000000000000000000000000000..a419e9fa1df2c45f975f8b27d4141bd847417ea5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9100_8d672a0c9ef946761f5b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8d672a0c9ef946761f5bb99d64fadf91679d181efec06c461ca22f22b2df5523
+size 925819
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9100_b48d3ff4e4cc4618b59d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9100_b48d3ff4e4cc4618b59d.png
new file mode 100644
index 0000000000000000000000000000000000000000..69f90fa4a35d50637071a3c28cda388653504a35
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9100_b48d3ff4e4cc4618b59d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b48d3ff4e4cc4618b59d99dbb038facd4d90341f56171a920e5aa7755f5dbbe6
+size 602213
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9300_0a1b1a784793e4208de5.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9300_0a1b1a784793e4208de5.png
new file mode 100644
index 0000000000000000000000000000000000000000..292f27e8f79b12a9bd5d88dd84ebcc839f78dbc5
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9300_0a1b1a784793e4208de5.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0a1b1a784793e4208de5e4ef131d3b50686ac27f2f91a9ee7103733182fa9f2f
+size 539579
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9300_787ae35159d11553f9e9.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9300_787ae35159d11553f9e9.png
new file mode 100644
index 0000000000000000000000000000000000000000..d6c1fcdcd293bc0e975927b202ab6a3f9de97b14
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9300_787ae35159d11553f9e9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:787ae35159d11553f9e9fbed0dae4323a87cbb4c33ffa8e7894ae004719d2828
+size 930411
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9300_b5c2292feec29a7ca5a7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9300_b5c2292feec29a7ca5a7.png
new file mode 100644
index 0000000000000000000000000000000000000000..93b40c6cf5403ad8212392920cbb7e03cb1d4093
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9300_b5c2292feec29a7ca5a7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b5c2292feec29a7ca5a71cd24b2da84dea3f5f91749cf85ec64b7c7c2d89cf65
+size 503018
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9300_de79d1c6059a175f846e.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9300_de79d1c6059a175f846e.png
new file mode 100644
index 0000000000000000000000000000000000000000..2b86c5d57a0be2132efffffdade9ab67d657699d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9300_de79d1c6059a175f846e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:de79d1c6059a175f846e9a44ec93aa949167ab4f22e96cf4f7f96ed0cd56cdc5
+size 1004471
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9500_0802df2efd03409311f1.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9500_0802df2efd03409311f1.png
new file mode 100644
index 0000000000000000000000000000000000000000..f80c40c42e59ca6443b8c2e2869089a66db86143
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9500_0802df2efd03409311f1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0802df2efd03409311f1880c2f370a5748a352c93ecdc76f34245f5c09a9a404
+size 427194
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9500_325b5ac18fd7d6762bfb.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9500_325b5ac18fd7d6762bfb.png
new file mode 100644
index 0000000000000000000000000000000000000000..8c7b926f308258a9b0cc43f4c8e048d15db974cf
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9500_325b5ac18fd7d6762bfb.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:325b5ac18fd7d6762bfb3c4cc0521ed152a8519fb5ae8e443e8bda86c25a49d3
+size 981582
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9500_8c88f61860302406fbca.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9500_8c88f61860302406fbca.png
new file mode 100644
index 0000000000000000000000000000000000000000..67396e40cdf977e5d4af7458c05a9b71d885141d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9500_8c88f61860302406fbca.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8c88f61860302406fbcad97e635adb7dd0aac7f4bbafd8b0a79c79d2d57c1784
+size 465903
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9500_a7c3963344aa38557e70.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9500_a7c3963344aa38557e70.png
new file mode 100644
index 0000000000000000000000000000000000000000..984aba40a1433da97a036d6899f3e4b3acd27d2a
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9500_a7c3963344aa38557e70.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a7c3963344aa38557e70c6a68f9c1384856a6531aa2d55fcb893cf631805040d
+size 693453
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9700_00ac8f1f8c0ff7bab356.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9700_00ac8f1f8c0ff7bab356.png
new file mode 100644
index 0000000000000000000000000000000000000000..fc268b3dc653ff178f86181b601b054fa727b4f1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9700_00ac8f1f8c0ff7bab356.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:00ac8f1f8c0ff7bab3563a9317bf287fe020ef8eee774c89c250e74d669f0d46
+size 577948
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9700_83982269f6e332ac38a7.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9700_83982269f6e332ac38a7.png
new file mode 100644
index 0000000000000000000000000000000000000000..6e2607885c19160f53fc2b41891d0145b59b3b41
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9700_83982269f6e332ac38a7.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:83982269f6e332ac38a75e1be14d6716af1bcd17f2ee4dea2040831d205ba5e0
+size 1014982
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9700_c098dc6cdac3e9ea4b51.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9700_c098dc6cdac3e9ea4b51.png
new file mode 100644
index 0000000000000000000000000000000000000000..f76758f0f99cab3b61e3dc4e94269e9739109c45
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9700_c098dc6cdac3e9ea4b51.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c098dc6cdac3e9ea4b512d5ad69d593cda8d59c418e7cd07a5792b747a11efe3
+size 895833
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9700_cdc3e58e2b2242a74f15.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9700_cdc3e58e2b2242a74f15.png
new file mode 100644
index 0000000000000000000000000000000000000000..c774e67c4d93212c2b8da17fb26d076d879006a3
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9700_cdc3e58e2b2242a74f15.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cdc3e58e2b2242a74f157ff9a22d6be240f07a66d9163ab1b07faff11b116197
+size 1170242
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9900_485a3de410dd65d002da.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9900_485a3de410dd65d002da.png
new file mode 100644
index 0000000000000000000000000000000000000000..43dd4cd439a25ea8d9105744cb2e8066a4e5a8a1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9900_485a3de410dd65d002da.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:485a3de410dd65d002da2a24157949d5d4979929b797193915bc88f6264326c3
+size 646373
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9900_84870fc9370007dcd599.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9900_84870fc9370007dcd599.png
new file mode 100644
index 0000000000000000000000000000000000000000..1e10972e715629ef35fb4fe941e5b1bcde85da3e
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9900_84870fc9370007dcd599.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:84870fc9370007dcd599845f3d60244f995bb540ae857782145e606cfcef713e
+size 509230
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9900_ad818b8d0a06d3d06f3f.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9900_ad818b8d0a06d3d06f3f.png
new file mode 100644
index 0000000000000000000000000000000000000000..a15b42390cf09700ed08425042f8e544f9f3ae0d
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9900_ad818b8d0a06d3d06f3f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ad818b8d0a06d3d06f3f205a50416fac72ec66bff770b7056bd87b12b24a3c45
+size 982456
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9900_e8e7e153f3e3ea552f5d.png b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9900_e8e7e153f3e3ea552f5d.png
new file mode 100644
index 0000000000000000000000000000000000000000..fdef3c1114c6727a1c6f16f214a74ef7b8edbda1
--- /dev/null
+++ b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/media/images/generated_videos_grid_9900_e8e7e153f3e3ea552f5d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e8e7e153f3e3ea552f5d9f862b701542693e17c2b8a925721e00670dfb5b6354
+size 562540
diff --git a/Meissonic/wandb/run-20251219_095320-szkwvitk/files/wandb-summary.json b/Meissonic/wandb/run-20251219_095320-szkwvitk/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/Meissonic/wandb/run-20251229_081634-hjn0m6c2/files/config.yaml b/Meissonic/wandb/run-20251229_081634-hjn0m6c2/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..92493c52e2bf2aa0612c78651c69727dd92d2aac
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_081634-hjn0m6c2/files/config.yaml
@@ -0,0 +1,309 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ atfv17l5d8rtbeeoodvcjytuxj5qt2ye:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - "True"
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "256"
+ - --video_width
+ - "256"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "2"
+ - --gradient_accumulation_steps
+ - "4"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576127954944"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T08:16:34.855351Z"
+ writerId: atfv17l5d8rtbeeoodvcjytuxj5qt2ye
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 4
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 2
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 256
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 256
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_081634-hjn0m6c2/files/requirements.txt b/Meissonic/wandb/run-20251229_081634-hjn0m6c2/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_081634-hjn0m6c2/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_081634-hjn0m6c2/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_081634-hjn0m6c2/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..c79ab9dbf3d1b8fc2c70ca259a6f0d77802f6ee8
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_081634-hjn0m6c2/files/wandb-metadata.json
@@ -0,0 +1,158 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:16:34.855351Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "True",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "256",
+ "--video_width",
+ "256",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "2",
+ "--gradient_accumulation_steps",
+ "4",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576127954944"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "atfv17l5d8rtbeeoodvcjytuxj5qt2ye"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_081634-hjn0m6c2/files/wandb-summary.json b/Meissonic/wandb/run-20251229_081634-hjn0m6c2/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..8afb95f49483c85658a334253ad61c5e4b5851ef
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_081634-hjn0m6c2/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_wandb":{"runtime":2},"_runtime":2}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_081634-hjn0m6c2/run-hjn0m6c2.wandb b/Meissonic/wandb/run-20251229_081634-hjn0m6c2/run-hjn0m6c2.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..3724cb3010fe6d0a165d913cfbe63fa26068d4cf
Binary files /dev/null and b/Meissonic/wandb/run-20251229_081634-hjn0m6c2/run-hjn0m6c2.wandb differ
diff --git a/Meissonic/wandb/run-20251229_081752-78ojckdj/files/config.yaml b/Meissonic/wandb/run-20251229_081752-78ojckdj/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..572b73e141e35b802ce1d73e632dc7184931285b
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_081752-78ojckdj/files/config.yaml
@@ -0,0 +1,309 @@
+_wandb:
+ value:
+   cli_version: 0.23.1
+   e:
+     zkex3lp5g2z9pkl87qlphzoninilz5fa:
+       args:
+         - --use_precomputed_video_only
+         - --features_dir
+         - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set
+         - --text_encoder_architecture
+         - umt5-xxl
+         - --wan_pretrained_path
+         - /mnt/Wan2.1-T2V-1.3B
+         - --training_from_scratch
+         - "True"
+         - --pretrained_model_name_or_path
+         - dummy
+         - --wan_backbone_lr_ratio
+         - "0.2"
+         - --num_frames
+         - "17"
+         - --video_height
+         - "256"
+         - --video_width
+         - "256"
+         - --dataloader_num_workers
+         - "8"
+         - --video_tokenizer_model_id
+         - Cosmos-0.1-Tokenizer-DV4x8x8
+         - --instance_dataset
+         - OpenVid1MDataset
+         - --instance_data_dir
+         - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+         - --train_batch_size
+         - "2"
+         - --gradient_accumulation_steps
+         - "4"
+         - --learning_rate
+         - "3e-3"
+         - --max_train_steps
+         - "100000"
+         - --checkpointing_steps
+         - "500"
+         - --validation_steps
+         - "100"
+         - --logging_steps
+         - "10"
+         - --validation_prompts
+         - a cat playing
+         - a girl walking
+         - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+         - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+         - --output_dir
+         - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+         - --mixed_precision
+         - bf16
+         - --lr_scheduler
+         - constant
+         - --lr_warmup_steps
+         - "0"
+         - --use_8bit_adam
+         - --gradient_checkpointing
+         - --min_masking_rate
+         - "0.0"
+         - --cond_dropout_prob
+         - "0.0"
+         - --split_vae_encode
+         - "1"
+         - --allow_tf32
+         - --seed
+         - "42"
+         - --report_to
+         - wandb
+       codePath: train/train_mei_video.py
+       codePathLocal: train/train_mei_video.py
+       cpu_count: 48
+       cpu_count_logical: 96
+       cudaVersion: "12.8"
+       disk:
+         /:
+           total: "16650112278528"
+           used: "15576128081920"
+       email: jinbin5bai@gmail.com
+       executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+       git:
+         commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+         remote: https://github.com/viiika/Meissonic.git
+       gpu: NVIDIA A100-SXM4-40GB
+       gpu_count: 8
+       gpu_nvidia:
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+       host: ip-172-31-91-136
+       memory:
+         total: "1204521451520"
+       os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+       program: /mnt/Meissonic/train/train_mei_video.py
+       python: CPython 3.10.19
+       root: /mnt/Meissonic
+       startedAt: "2025-12-29T08:17:52.346764Z"
+       writerId: zkex3lp5g2z9pkl87qlphzoninilz5fa
+   m: []
+   python_version: 3.10.19
+   t:
+     "1":
+       - 1
+       - 11
+       - 41
+       - 49
+       - 51
+       - 71
+       - 83
+       - 98
+     "2":
+       - 1
+       - 11
+       - 41
+       - 49
+       - 51
+       - 71
+       - 83
+       - 98
+     "4": 3.10.19
+     "5": 0.23.1
+     "6": 4.57.3
+     "12": 0.23.1
+     "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 4
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 2
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 256
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 256
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_081752-78ojckdj/files/requirements.txt b/Meissonic/wandb/run-20251229_081752-78ojckdj/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_081752-78ojckdj/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_081752-78ojckdj/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_081752-78ojckdj/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..4958e288497f13b4c9e13f847231e3a89ccde134
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_081752-78ojckdj/files/wandb-metadata.json
@@ -0,0 +1,158 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:17:52.346764Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "True",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "256",
+ "--video_width",
+ "256",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "2",
+ "--gradient_accumulation_steps",
+ "4",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576128081920"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "zkex3lp5g2z9pkl87qlphzoninilz5fa"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_081752-78ojckdj/files/wandb-summary.json b/Meissonic/wandb/run-20251229_081752-78ojckdj/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..8afb95f49483c85658a334253ad61c5e4b5851ef
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_081752-78ojckdj/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_wandb":{"runtime":2},"_runtime":2}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_081752-78ojckdj/run-78ojckdj.wandb b/Meissonic/wandb/run-20251229_081752-78ojckdj/run-78ojckdj.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..8ab95190308d1b59d9cddb8cce945b29b22d9057
Binary files /dev/null and b/Meissonic/wandb/run-20251229_081752-78ojckdj/run-78ojckdj.wandb differ
diff --git a/Meissonic/wandb/run-20251229_081959-tvb7bjux/files/requirements.txt b/Meissonic/wandb/run-20251229_081959-tvb7bjux/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_081959-tvb7bjux/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_081959-tvb7bjux/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_081959-tvb7bjux/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..cbab7264e1963cac0307a473cff1a120a918e76e
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_081959-tvb7bjux/files/wandb-metadata.json
@@ -0,0 +1,158 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:19:59.376492Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "True",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "256",
+ "--video_width",
+ "256",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "2",
+ "--gradient_accumulation_steps",
+ "4",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576128245760"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "orvixx3hp2wmdyc3p6d6x92y3xjs7gpu"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_081959-tvb7bjux/run-tvb7bjux.wandb b/Meissonic/wandb/run-20251229_081959-tvb7bjux/run-tvb7bjux.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..30977402b08eb894463b83d7de7a007255879a32
Binary files /dev/null and b/Meissonic/wandb/run-20251229_081959-tvb7bjux/run-tvb7bjux.wandb differ
diff --git a/Meissonic/wandb/run-20251229_082208-d5bens3y/files/config.yaml b/Meissonic/wandb/run-20251229_082208-d5bens3y/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..83ac83bb3dc8ed70b65f8c7f7c45dd4f44fa5f22
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_082208-d5bens3y/files/config.yaml
@@ -0,0 +1,309 @@
+_wandb:
+ value:
+   cli_version: 0.23.1
+   e:
+     0do11cxq84ks57gwyzlkq3oixg2jsfez:
+       args:
+         - --use_precomputed_video_only
+         - --features_dir
+         - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set
+         - --text_encoder_architecture
+         - umt5-xxl
+         - --wan_pretrained_path
+         - /mnt/Wan2.1-T2V-1.3B
+         - --training_from_scratch
+         - "True"
+         - --pretrained_model_name_or_path
+         - dummy
+         - --wan_backbone_lr_ratio
+         - "0.2"
+         - --num_frames
+         - "17"
+         - --video_height
+         - "256"
+         - --video_width
+         - "256"
+         - --dataloader_num_workers
+         - "8"
+         - --video_tokenizer_model_id
+         - Cosmos-0.1-Tokenizer-DV4x8x8
+         - --instance_dataset
+         - OpenVid1MDataset
+         - --instance_data_dir
+         - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+         - --train_batch_size
+         - "2"
+         - --gradient_accumulation_steps
+         - "4"
+         - --learning_rate
+         - "3e-3"
+         - --max_train_steps
+         - "100000"
+         - --checkpointing_steps
+         - "500"
+         - --validation_steps
+         - "100"
+         - --logging_steps
+         - "10"
+         - --validation_prompts
+         - a cat playing
+         - a girl walking
+         - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+         - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+         - --output_dir
+         - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+         - --mixed_precision
+         - bf16
+         - --lr_scheduler
+         - constant
+         - --lr_warmup_steps
+         - "0"
+         - --use_8bit_adam
+         - --gradient_checkpointing
+         - --min_masking_rate
+         - "0.0"
+         - --cond_dropout_prob
+         - "0.0"
+         - --split_vae_encode
+         - "1"
+         - --allow_tf32
+         - --seed
+         - "42"
+         - --report_to
+         - wandb
+       codePath: train/train_mei_video.py
+       codePathLocal: train/train_mei_video.py
+       cpu_count: 48
+       cpu_count_logical: 96
+       cudaVersion: "12.8"
+       disk:
+         /:
+           total: "16650112278528"
+           used: "15576128356352"
+       email: jinbin5bai@gmail.com
+       executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+       git:
+         commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+         remote: https://github.com/viiika/Meissonic.git
+       gpu: NVIDIA A100-SXM4-40GB
+       gpu_count: 8
+       gpu_nvidia:
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+         - architecture: Ampere
+           cudaCores: 6912
+           memoryTotal: "42949672960"
+           name: NVIDIA A100-SXM4-40GB
+           uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+       host: ip-172-31-91-136
+       memory:
+         total: "1204521451520"
+       os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+       program: /mnt/Meissonic/train/train_mei_video.py
+       python: CPython 3.10.19
+       root: /mnt/Meissonic
+       startedAt: "2025-12-29T08:22:08.563808Z"
+       writerId: 0do11cxq84ks57gwyzlkq3oixg2jsfez
+   m: []
+   python_version: 3.10.19
+   t:
+     "1":
+       - 1
+       - 11
+       - 41
+       - 49
+       - 51
+       - 71
+       - 83
+       - 98
+     "2":
+       - 1
+       - 11
+       - 41
+       - 49
+       - 51
+       - 71
+       - 83
+       - 98
+     "4": 3.10.19
+     "5": 0.23.1
+     "6": 4.57.3
+     "12": 0.23.1
+     "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 4
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 2
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 256
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 256
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_082208-d5bens3y/files/requirements.txt b/Meissonic/wandb/run-20251229_082208-d5bens3y/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_082208-d5bens3y/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_082208-d5bens3y/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_082208-d5bens3y/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..629f2020affa57938d72beecd00033b7c63f3510
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_082208-d5bens3y/files/wandb-metadata.json
@@ -0,0 +1,158 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:22:08.563808Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "True",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "256",
+ "--video_width",
+ "256",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "2",
+ "--gradient_accumulation_steps",
+ "4",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576128356352"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "0do11cxq84ks57gwyzlkq3oixg2jsfez"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_082208-d5bens3y/files/wandb-summary.json b/Meissonic/wandb/run-20251229_082208-d5bens3y/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..984aba10a00cb8a981bbca1cbc3e577ebdef58cf
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_082208-d5bens3y/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_runtime":60,"_wandb":{"runtime":60}}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_082208-d5bens3y/run-d5bens3y.wandb b/Meissonic/wandb/run-20251229_082208-d5bens3y/run-d5bens3y.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..e3764f07a11ce17505e4ee2ceeba126d8d4de410
Binary files /dev/null and b/Meissonic/wandb/run-20251229_082208-d5bens3y/run-d5bens3y.wandb differ
diff --git a/Meissonic/wandb/run-20251229_082348-xdcob8vv/files/config.yaml b/Meissonic/wandb/run-20251229_082348-xdcob8vv/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..dc8bf399a27719e2e594a2004135aee6d3b1ffc8
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_082348-xdcob8vv/files/config.yaml
@@ -0,0 +1,309 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ cexoq7e8axe3b6nnx8tjxfhoa3q5j093:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - "True"
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "256"
+ - --video_width
+ - "256"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "4"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576128708608"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T08:23:48.413885Z"
+ writerId: cexoq7e8axe3b6nnx8tjxfhoa3q5j093
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 4
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 256
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 256
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_082348-xdcob8vv/files/requirements.txt b/Meissonic/wandb/run-20251229_082348-xdcob8vv/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_082348-xdcob8vv/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_082348-xdcob8vv/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_082348-xdcob8vv/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..2249996df9abedd0adab5693a42ae1d00676c3a2
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_082348-xdcob8vv/files/wandb-metadata.json
@@ -0,0 +1,158 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:23:48.413885Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_256_256_full_set",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "True",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "256",
+ "--video_width",
+ "256",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "4",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576128708608"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "cexoq7e8axe3b6nnx8tjxfhoa3q5j093"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_082348-xdcob8vv/files/wandb-summary.json b/Meissonic/wandb/run-20251229_082348-xdcob8vv/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..1a94005933617ca388ce16c6cb55ce2f06fcf658
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_082348-xdcob8vv/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_wandb":{"runtime":62},"_runtime":62}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_082348-xdcob8vv/run-xdcob8vv.wandb b/Meissonic/wandb/run-20251229_082348-xdcob8vv/run-xdcob8vv.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..56313eabe98e3346cf20df4f17de88a9d5afe74f
Binary files /dev/null and b/Meissonic/wandb/run-20251229_082348-xdcob8vv/run-xdcob8vv.wandb differ
diff --git a/Meissonic/wandb/run-20251229_082735-s2rbngfj/files/config.yaml b/Meissonic/wandb/run-20251229_082735-s2rbngfj/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..3b6c59d67fcb113b4d1c2810c40b0d3784c2e98c
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_082735-s2rbngfj/files/config.yaml
@@ -0,0 +1,309 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ 23ej42n6czwqb5tfzy8zgnxhw87hp68m:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - "True"
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "4"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576128917504"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T08:27:35.178318Z"
+ writerId: 23ej42n6czwqb5tfzy8zgnxhw87hp68m
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 4
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_082735-s2rbngfj/files/requirements.txt b/Meissonic/wandb/run-20251229_082735-s2rbngfj/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_082735-s2rbngfj/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_082735-s2rbngfj/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_082735-s2rbngfj/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..0483236ec49757cd80d41befac8336014e5d14fc
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_082735-s2rbngfj/files/wandb-metadata.json
@@ -0,0 +1,158 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:27:35.178318Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "True",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "4",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576128917504"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "23ej42n6czwqb5tfzy8zgnxhw87hp68m"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_082735-s2rbngfj/files/wandb-summary.json b/Meissonic/wandb/run-20251229_082735-s2rbngfj/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..519453af2693820c461935aa95c333853a63586f
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_082735-s2rbngfj/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_runtime":37,"_wandb":{"runtime":37}}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_082735-s2rbngfj/run-s2rbngfj.wandb b/Meissonic/wandb/run-20251229_082735-s2rbngfj/run-s2rbngfj.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..ad1669cf07310bb482a499ee6b183ecbc3f2c670
Binary files /dev/null and b/Meissonic/wandb/run-20251229_082735-s2rbngfj/run-s2rbngfj.wandb differ
diff --git a/Meissonic/wandb/run-20251229_083018-js6dhqj8/files/config.yaml b/Meissonic/wandb/run-20251229_083018-js6dhqj8/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..6d80db42f6ab6bdcfedf99f49498fab7e910b017
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083018-js6dhqj8/files/config.yaml
@@ -0,0 +1,309 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ klckxk39jaoiyfq1gchmx3az6t8cjds2:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - "True"
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "4"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576129069056"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T08:30:18.123071Z"
+ writerId: klckxk39jaoiyfq1gchmx3az6t8cjds2
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 4
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_083018-js6dhqj8/files/requirements.txt b/Meissonic/wandb/run-20251229_083018-js6dhqj8/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083018-js6dhqj8/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_083018-js6dhqj8/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_083018-js6dhqj8/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..37640d405846139dc9a9019391cbd179f5cff95a
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083018-js6dhqj8/files/wandb-metadata.json
@@ -0,0 +1,158 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:30:18.123071Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "True",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "4",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576129069056"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "klckxk39jaoiyfq1gchmx3az6t8cjds2"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_083018-js6dhqj8/files/wandb-summary.json b/Meissonic/wandb/run-20251229_083018-js6dhqj8/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..41c80f51acbd739213ed6fbdd4077790252bc00d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083018-js6dhqj8/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_wandb":{"runtime":36},"_runtime":36}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_083018-js6dhqj8/run-js6dhqj8.wandb b/Meissonic/wandb/run-20251229_083018-js6dhqj8/run-js6dhqj8.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..6b7d3d07d85c21d65f998a8d598cf6d4de39666e
Binary files /dev/null and b/Meissonic/wandb/run-20251229_083018-js6dhqj8/run-js6dhqj8.wandb differ
diff --git a/Meissonic/wandb/run-20251229_083518-un2j6o0e/files/config.yaml b/Meissonic/wandb/run-20251229_083518-un2j6o0e/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..b642f1da44fc7a50c30e259377583222dbb2c95c
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083518-un2j6o0e/files/config.yaml
@@ -0,0 +1,309 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ cef545nzoydf3af57q8i0xeevxfhokhu:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - "True"
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "4"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576129302528"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T08:35:18.601369Z"
+ writerId: cef545nzoydf3af57q8i0xeevxfhokhu
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 4
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_083518-un2j6o0e/files/requirements.txt b/Meissonic/wandb/run-20251229_083518-un2j6o0e/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083518-un2j6o0e/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_083518-un2j6o0e/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_083518-un2j6o0e/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..98f3ea4d4d06b70d83ffb4db2f9ec9493dbb0e93
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083518-un2j6o0e/files/wandb-metadata.json
@@ -0,0 +1,158 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:35:18.601369Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "True",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "4",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576129302528"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "cef545nzoydf3af57q8i0xeevxfhokhu"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_083518-un2j6o0e/files/wandb-summary.json b/Meissonic/wandb/run-20251229_083518-un2j6o0e/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..69e8a87828be38513e2aff6e058e4a9485d68a00
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083518-un2j6o0e/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_wandb":{"runtime":38},"_runtime":38}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_083518-un2j6o0e/run-un2j6o0e.wandb b/Meissonic/wandb/run-20251229_083518-un2j6o0e/run-un2j6o0e.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..6720d242b2c916012a326da076812428c930889a
Binary files /dev/null and b/Meissonic/wandb/run-20251229_083518-un2j6o0e/run-un2j6o0e.wandb differ
diff --git a/Meissonic/wandb/run-20251229_083628-ef52280g/files/config.yaml b/Meissonic/wandb/run-20251229_083628-ef52280g/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..0cc8dffe2e5beec6f74d7d3b3d296ac375ceb440
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083628-ef52280g/files/config.yaml
@@ -0,0 +1,309 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ gey79twhahqhiya7s0c4ou5v5d8w2hkj:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - "True"
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576129425408"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T08:36:28.442156Z"
+ writerId: gey79twhahqhiya7s0c4ou5v5d8w2hkj
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_083628-ef52280g/files/requirements.txt b/Meissonic/wandb/run-20251229_083628-ef52280g/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083628-ef52280g/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_083628-ef52280g/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_083628-ef52280g/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..b62b0211f87deb124625363da3efa86b4b3d5e6f
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083628-ef52280g/files/wandb-metadata.json
@@ -0,0 +1,158 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:36:28.442156Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "True",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576129425408"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "gey79twhahqhiya7s0c4ou5v5d8w2hkj"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_083628-ef52280g/files/wandb-summary.json b/Meissonic/wandb/run-20251229_083628-ef52280g/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..3e4dda75148b11087b7ca4d383b6747633b42a57
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083628-ef52280g/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_wandb":{"runtime":37},"_runtime":37}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_083628-ef52280g/run-ef52280g.wandb b/Meissonic/wandb/run-20251229_083628-ef52280g/run-ef52280g.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..55a0a30f099a1b558b296bed8b39842c94835419
Binary files /dev/null and b/Meissonic/wandb/run-20251229_083628-ef52280g/run-ef52280g.wandb differ
diff --git a/Meissonic/wandb/run-20251229_083809-sx4rkgm3/files/config.yaml b/Meissonic/wandb/run-20251229_083809-sx4rkgm3/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..b1b3064a79dd2d24ecc1e208146d580c39b8de4a
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083809-sx4rkgm3/files/config.yaml
@@ -0,0 +1,309 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ dany56ckri37c4cswzaw5bcanyqvfkvb:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - "True"
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576129572864"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T08:38:09.169418Z"
+ writerId: dany56ckri37c4cswzaw5bcanyqvfkvb
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_083809-sx4rkgm3/files/requirements.txt b/Meissonic/wandb/run-20251229_083809-sx4rkgm3/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083809-sx4rkgm3/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_083809-sx4rkgm3/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_083809-sx4rkgm3/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..e46cf05b85a9a094dc3e10050351436836397067
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083809-sx4rkgm3/files/wandb-metadata.json
@@ -0,0 +1,158 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:38:09.169418Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "True",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576129572864"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "dany56ckri37c4cswzaw5bcanyqvfkvb"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_083809-sx4rkgm3/files/wandb-summary.json b/Meissonic/wandb/run-20251229_083809-sx4rkgm3/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..1f6a8d6fba6992723d0f745edbf01bee18eb2022
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_083809-sx4rkgm3/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_wandb":{"runtime":27},"_runtime":27}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_083809-sx4rkgm3/run-sx4rkgm3.wandb b/Meissonic/wandb/run-20251229_083809-sx4rkgm3/run-sx4rkgm3.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..6b355c465eadb796410037c31b001e42cc749b61
Binary files /dev/null and b/Meissonic/wandb/run-20251229_083809-sx4rkgm3/run-sx4rkgm3.wandb differ
diff --git a/Meissonic/wandb/run-20251229_084520-fzi541le/files/config.yaml b/Meissonic/wandb/run-20251229_084520-fzi541le/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..5d69cb69f91a15be5f137b8a0be4c42dc70da3db
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_084520-fzi541le/files/config.yaml
@@ -0,0 +1,309 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ xdutcov5l8zjxveke9fy1hqfvc51vmd0:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - "True"
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576129806336"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T08:45:20.618023Z"
+ writerId: xdutcov5l8zjxveke9fy1hqfvc51vmd0
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_084520-fzi541le/files/requirements.txt b/Meissonic/wandb/run-20251229_084520-fzi541le/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_084520-fzi541le/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_084520-fzi541le/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_084520-fzi541le/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..e96947c0ffcb23084869a37b4f6fb69e6fb7e97c
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_084520-fzi541le/files/wandb-metadata.json
@@ -0,0 +1,158 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:45:20.618023Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "True",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576129806336"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "xdutcov5l8zjxveke9fy1hqfvc51vmd0"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_084520-fzi541le/files/wandb-summary.json b/Meissonic/wandb/run-20251229_084520-fzi541le/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..1d476fc88692f959c7a899096787abbc21a55dbc
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_084520-fzi541le/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_runtime":0,"_wandb":{"runtime":0}}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_084520-fzi541le/run-fzi541le.wandb b/Meissonic/wandb/run-20251229_084520-fzi541le/run-fzi541le.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..b32f3b80f5d0d3bc1cb4e6b00b46eba1bdc1cfbd
Binary files /dev/null and b/Meissonic/wandb/run-20251229_084520-fzi541le/run-fzi541le.wandb differ
diff --git a/Meissonic/wandb/run-20251229_084618-2l6k4nad/files/requirements.txt b/Meissonic/wandb/run-20251229_084618-2l6k4nad/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_084618-2l6k4nad/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_084618-2l6k4nad/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_084618-2l6k4nad/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..0b27e905b62e88c82c8f755038dd5300834d002d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_084618-2l6k4nad/files/wandb-metadata.json
@@ -0,0 +1,158 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:46:18.953823Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "True",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576129921024"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "3to1koba6ycc7u4c1rtr7w2vmos56r8w"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_084618-2l6k4nad/run-2l6k4nad.wandb b/Meissonic/wandb/run-20251229_084618-2l6k4nad/run-2l6k4nad.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..30977402b08eb894463b83d7de7a007255879a32
Binary files /dev/null and b/Meissonic/wandb/run-20251229_084618-2l6k4nad/run-2l6k4nad.wandb differ
diff --git a/Meissonic/wandb/run-20251229_085306-rfncwmtb/files/requirements.txt b/Meissonic/wandb/run-20251229_085306-rfncwmtb/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_085306-rfncwmtb/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_085306-rfncwmtb/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_085306-rfncwmtb/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..13a2eae9a84e81ec403b77e924615cd19132f156
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_085306-rfncwmtb/files/wandb-metadata.json
@@ -0,0 +1,157 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:53:06.346699Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576130723840"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "088z9y3i6hpxv8yvsakvkmxfx7j2zud5"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_085306-rfncwmtb/run-rfncwmtb.wandb b/Meissonic/wandb/run-20251229_085306-rfncwmtb/run-rfncwmtb.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..30977402b08eb894463b83d7de7a007255879a32
Binary files /dev/null and b/Meissonic/wandb/run-20251229_085306-rfncwmtb/run-rfncwmtb.wandb differ
diff --git a/Meissonic/wandb/run-20251229_085719-9ezk0nqn/files/config.yaml b/Meissonic/wandb/run-20251229_085719-9ezk0nqn/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..60d349d09563b6d635775d28509cca9b9b659a95
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_085719-9ezk0nqn/files/config.yaml
@@ -0,0 +1,308 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ 5phq83bjhmlxddgoilx7eg4m1m2huy0f:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576135294976"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T08:57:19.168946Z"
+ writerId: 5phq83bjhmlxddgoilx7eg4m1m2huy0f
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_085719-9ezk0nqn/files/requirements.txt b/Meissonic/wandb/run-20251229_085719-9ezk0nqn/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_085719-9ezk0nqn/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_085719-9ezk0nqn/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_085719-9ezk0nqn/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..f0db55891fc9df6df88114ab08e82830fa56d648
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_085719-9ezk0nqn/files/wandb-metadata.json
@@ -0,0 +1,157 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T08:57:19.168946Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576135294976"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "5phq83bjhmlxddgoilx7eg4m1m2huy0f"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_085719-9ezk0nqn/files/wandb-summary.json b/Meissonic/wandb/run-20251229_085719-9ezk0nqn/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..bf55868a8b18bdd0746b03671b2168432b325fcb
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_085719-9ezk0nqn/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_runtime":28,"_wandb":{"runtime":28}}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_085719-9ezk0nqn/run-9ezk0nqn.wandb b/Meissonic/wandb/run-20251229_085719-9ezk0nqn/run-9ezk0nqn.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..a748d3f99d280cd6757c0efefc770c2973621396
Binary files /dev/null and b/Meissonic/wandb/run-20251229_085719-9ezk0nqn/run-9ezk0nqn.wandb differ
diff --git a/Meissonic/wandb/run-20251229_090331-alguiic1/files/config.yaml b/Meissonic/wandb/run-20251229_090331-alguiic1/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..a929393378c2be8a560ce5a9df8de02ad7700504
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_090331-alguiic1/files/config.yaml
@@ -0,0 +1,308 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ emji49jn678obspjvv2gi3mj8a2k6sq4:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576135458816"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T09:03:31.275394Z"
+ writerId: emji49jn678obspjvv2gi3mj8a2k6sq4
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_090331-alguiic1/files/requirements.txt b/Meissonic/wandb/run-20251229_090331-alguiic1/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_090331-alguiic1/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_090331-alguiic1/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_090331-alguiic1/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..f856e16c6173072367b2fcff139ec2cee04f4a83
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_090331-alguiic1/files/wandb-metadata.json
@@ -0,0 +1,157 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T09:03:31.275394Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576135458816"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "emji49jn678obspjvv2gi3mj8a2k6sq4"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_090331-alguiic1/files/wandb-summary.json b/Meissonic/wandb/run-20251229_090331-alguiic1/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..0c3c0567d4c23124b231f2b4ddb7fa8d30a38646
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_090331-alguiic1/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_runtime":27,"_wandb":{"runtime":27}}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_090331-alguiic1/run-alguiic1.wandb b/Meissonic/wandb/run-20251229_090331-alguiic1/run-alguiic1.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..6688560bf96970ef25e40bac0fd4575b0b095dee
Binary files /dev/null and b/Meissonic/wandb/run-20251229_090331-alguiic1/run-alguiic1.wandb differ
diff --git a/Meissonic/wandb/run-20251229_090648-6beufw2w/files/config.yaml b/Meissonic/wandb/run-20251229_090648-6beufw2w/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..b2c3fec0dabe2b3f94f3dab9b5a59d4f5def2283
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_090648-6beufw2w/files/config.yaml
@@ -0,0 +1,308 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ fqod8s6y2ph1ko6q2mj7qiafxdgfb2xg:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576135680000"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T09:06:48.255954Z"
+ writerId: fqod8s6y2ph1ko6q2mj7qiafxdgfb2xg
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_090648-6beufw2w/files/requirements.txt b/Meissonic/wandb/run-20251229_090648-6beufw2w/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_090648-6beufw2w/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_090648-6beufw2w/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_090648-6beufw2w/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..419a13ebf9b0f736f0f5343e6d558c65532a6840
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_090648-6beufw2w/files/wandb-metadata.json
@@ -0,0 +1,157 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T09:06:48.255954Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576135680000"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "fqod8s6y2ph1ko6q2mj7qiafxdgfb2xg"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_090648-6beufw2w/files/wandb-summary.json b/Meissonic/wandb/run-20251229_090648-6beufw2w/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..334abaad9a4c576b8b414ca3a7804f2fbc807661
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_090648-6beufw2w/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_wandb":{"runtime":28},"_runtime":28}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_090648-6beufw2w/run-6beufw2w.wandb b/Meissonic/wandb/run-20251229_090648-6beufw2w/run-6beufw2w.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..aa7ad860a98a72d36b73a515e204f54146e7a2eb
Binary files /dev/null and b/Meissonic/wandb/run-20251229_090648-6beufw2w/run-6beufw2w.wandb differ
diff --git a/Meissonic/wandb/run-20251229_091141-ka0jd7f5/files/config.yaml b/Meissonic/wandb/run-20251229_091141-ka0jd7f5/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..5bd636c12b826126acbd9a1017ecf86f2a5560b6
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091141-ka0jd7f5/files/config.yaml
@@ -0,0 +1,308 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ aepprjfcso077zyaqxob652staeuxwcx:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576135843840"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T09:11:41.395964Z"
+ writerId: aepprjfcso077zyaqxob652staeuxwcx
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_091141-ka0jd7f5/files/requirements.txt b/Meissonic/wandb/run-20251229_091141-ka0jd7f5/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091141-ka0jd7f5/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_091141-ka0jd7f5/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_091141-ka0jd7f5/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..c0e80ce168baf4ec21205386e05f81c123614929
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091141-ka0jd7f5/files/wandb-metadata.json
@@ -0,0 +1,157 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T09:11:41.395964Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576135843840"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "aepprjfcso077zyaqxob652staeuxwcx"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_091141-ka0jd7f5/files/wandb-summary.json b/Meissonic/wandb/run-20251229_091141-ka0jd7f5/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..47f942af33b82da05c85c1fab7e60e57a1ad7d88
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091141-ka0jd7f5/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_runtime":76,"_wandb":{"runtime":76}}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_091141-ka0jd7f5/run-ka0jd7f5.wandb b/Meissonic/wandb/run-20251229_091141-ka0jd7f5/run-ka0jd7f5.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..554d446f9710ac99362fe8ba2d80066970784014
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091141-ka0jd7f5/run-ka0jd7f5.wandb differ
diff --git a/Meissonic/wandb/run-20251229_091548-1gbb0o27/files/config.yaml b/Meissonic/wandb/run-20251229_091548-1gbb0o27/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..16ca22250fcbc32d343ddecd325b5ccb2950b4df
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091548-1gbb0o27/files/config.yaml
@@ -0,0 +1,308 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ 2s2ya1sag5knet8vv0nugmmk3kg2p4b4:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576136171520"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T09:15:48.674161Z"
+ writerId: 2s2ya1sag5knet8vv0nugmmk3kg2p4b4
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_091548-1gbb0o27/files/requirements.txt b/Meissonic/wandb/run-20251229_091548-1gbb0o27/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091548-1gbb0o27/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_091548-1gbb0o27/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_091548-1gbb0o27/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..cef4197a13fe549f31b6e3aba7dda9d04e1d34ed
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091548-1gbb0o27/files/wandb-metadata.json
@@ -0,0 +1,157 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T09:15:48.674161Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576136171520"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "2s2ya1sag5knet8vv0nugmmk3kg2p4b4"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_091548-1gbb0o27/files/wandb-summary.json b/Meissonic/wandb/run-20251229_091548-1gbb0o27/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..022646b35b7c26b16cc2e065b40d7c01e5cfafde
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091548-1gbb0o27/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_wandb":{"runtime":76},"_runtime":76}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_091548-1gbb0o27/run-1gbb0o27.wandb b/Meissonic/wandb/run-20251229_091548-1gbb0o27/run-1gbb0o27.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..e557c21f74034fe7857a20e793a9052318a1ccab
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091548-1gbb0o27/run-1gbb0o27.wandb differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/config.yaml b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..816a53299f5c24ee098aad6bfb62b745f6e178ef
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/config.yaml
@@ -0,0 +1,310 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ kfxt7o6ujt71x9w49ztafzu7mrji6t6i:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15576136318976"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.10.19
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T09:17:49.878310Z"
+ writerId: kfxt7o6ujt71x9w49ztafzu7mrji6t6i
+ m: []
+ python_version: 3.10.19
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ "3":
+ - 61
+ "4": 3.10.19
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_100_663a18da6cbce638dca3.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_100_663a18da6cbce638dca3.png
new file mode 100644
index 0000000000000000000000000000000000000000..92d0df87b66a3fe1c8a9d9f81516a616e7b70465
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_100_663a18da6cbce638dca3.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_100_941c6ce47b081cba0b27.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_100_941c6ce47b081cba0b27.png
new file mode 100644
index 0000000000000000000000000000000000000000..c6a554189a2c94812a03e01db8e7785458656d7c
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_100_941c6ce47b081cba0b27.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_100_afecdb2b419aec330b4f.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_100_afecdb2b419aec330b4f.png
new file mode 100644
index 0000000000000000000000000000000000000000..a9e2fc24a3576beb0a2b72a34a57589594b2218c
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_100_afecdb2b419aec330b4f.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_100_e9e362a34fd622039019.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_100_e9e362a34fd622039019.png
new file mode 100644
index 0000000000000000000000000000000000000000..d1ac4e654261bae35bb0cbacbc58fda56f047b50
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_100_e9e362a34fd622039019.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_200_6ad83552ea32063cad15.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_200_6ad83552ea32063cad15.png
new file mode 100644
index 0000000000000000000000000000000000000000..dcef3f73036c537f3fee52b91208b8f3e345ec6a
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_200_6ad83552ea32063cad15.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_200_70f1ac0d7a90bfcb75d4.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_200_70f1ac0d7a90bfcb75d4.png
new file mode 100644
index 0000000000000000000000000000000000000000..cc9fd244f9a146c0d43ce432f715459725bd4655
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_200_70f1ac0d7a90bfcb75d4.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_200_deb159e6b9240b881b93.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_200_deb159e6b9240b881b93.png
new file mode 100644
index 0000000000000000000000000000000000000000..f1e29f14224b25563b8b4ba74f7a2bbb1caea5a6
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_200_deb159e6b9240b881b93.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_200_e44ef9ece28e7f977230.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_200_e44ef9ece28e7f977230.png
new file mode 100644
index 0000000000000000000000000000000000000000..03922e2d24ed7d4fbaa528801536a42260b38778
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_200_e44ef9ece28e7f977230.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_300_6884549b0303adefd42d.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_300_6884549b0303adefd42d.png
new file mode 100644
index 0000000000000000000000000000000000000000..1ab0230a6383876ee07d575f4ddc57a1001ba001
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_300_6884549b0303adefd42d.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_300_8f013c8d3331dd06354c.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_300_8f013c8d3331dd06354c.png
new file mode 100644
index 0000000000000000000000000000000000000000..7cbb42af9a3809e05ce17c720c96345b38bd0c3c
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_300_8f013c8d3331dd06354c.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_300_93afeb6257974016a899.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_300_93afeb6257974016a899.png
new file mode 100644
index 0000000000000000000000000000000000000000..8937cd31f03b3e46e3ab773bfcba07fef9a02024
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_300_93afeb6257974016a899.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_300_9d423069b5a9cd2104da.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_300_9d423069b5a9cd2104da.png
new file mode 100644
index 0000000000000000000000000000000000000000..367b87dc8a9c4d6a2cd7140f8592378d51221d7a
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_300_9d423069b5a9cd2104da.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_400_0f42b78b94bd3a593399.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_400_0f42b78b94bd3a593399.png
new file mode 100644
index 0000000000000000000000000000000000000000..966b1529c830b314426bafddeb6208d4f1a34f47
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_400_0f42b78b94bd3a593399.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_400_1d62580a9de1268911e7.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_400_1d62580a9de1268911e7.png
new file mode 100644
index 0000000000000000000000000000000000000000..817a5d23ab31015d31cab15e3a6464117abc8fdf
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_400_1d62580a9de1268911e7.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_400_35b6b89b5aac853dd909.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_400_35b6b89b5aac853dd909.png
new file mode 100644
index 0000000000000000000000000000000000000000..a2583a0336c78333de93eea34fe5948b13fcf51e
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_400_35b6b89b5aac853dd909.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_400_adfbcce8afb45b1c30a8.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_400_adfbcce8afb45b1c30a8.png
new file mode 100644
index 0000000000000000000000000000000000000000..576846cc0a15f585be2f6fe0202e46c29a7a4e36
Binary files /dev/null and b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_first_frame_400_adfbcce8afb45b1c30a8.png differ
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_100_34a39a5e2650e19f4ae1.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_100_34a39a5e2650e19f4ae1.png
new file mode 100644
index 0000000000000000000000000000000000000000..e62e424bc1bd45f7d1e357b8b2c316ec41bc4145
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_100_34a39a5e2650e19f4ae1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:34a39a5e2650e19f4ae11bb0ecc0f91e383596dc4ebaec8c93dbb2c54f34cf6b
+size 334833
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_100_387ccd362ea3ed3ef742.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_100_387ccd362ea3ed3ef742.png
new file mode 100644
index 0000000000000000000000000000000000000000..5a12bdbe11696145d64e42f7594d32abdeb27f1e
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_100_387ccd362ea3ed3ef742.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:387ccd362ea3ed3ef742fab8130f3dfc9acc7d4142ebd95eb9454ff5cf11b6e5
+size 353413
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_100_57ac9d7ca6f53f2d5750.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_100_57ac9d7ca6f53f2d5750.png
new file mode 100644
index 0000000000000000000000000000000000000000..e6837fff66af50236ae5163a2c36a32b29cdbd08
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_100_57ac9d7ca6f53f2d5750.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:57ac9d7ca6f53f2d57508a0763dae9e1205574fa7b0896eeba06a99cdbbc9d00
+size 340649
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_100_7d2b159c135bddff0ef3.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_100_7d2b159c135bddff0ef3.png
new file mode 100644
index 0000000000000000000000000000000000000000..ea26849ae5373ddc313710295a8a041700269623
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_100_7d2b159c135bddff0ef3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7d2b159c135bddff0ef32d29efa51926e89b5a01975320633a56c8e05adf70b0
+size 376050
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_200_21c04da9675e4094286b.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_200_21c04da9675e4094286b.png
new file mode 100644
index 0000000000000000000000000000000000000000..f4e50a10dd6fe1a557ff85678b8d0b2c004c5fdf
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_200_21c04da9675e4094286b.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:21c04da9675e4094286b2f9c37ea4a8df5d5db82dd0c7fc7d8ae1c63b28e8abf
+size 327552
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_200_cd0e9d5c1e3bee9d1089.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_200_cd0e9d5c1e3bee9d1089.png
new file mode 100644
index 0000000000000000000000000000000000000000..d2346322fc72f2362d43bd2ea9f6f65b119f4167
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_200_cd0e9d5c1e3bee9d1089.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cd0e9d5c1e3bee9d108922d237bf6fb2817e933fe7fa00d73f6e29d44709ae15
+size 325346
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_200_f495450d941a045dd727.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_200_f495450d941a045dd727.png
new file mode 100644
index 0000000000000000000000000000000000000000..24e3e3cef1ff2f34fc91158cd88cab5cdb00e8a8
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_200_f495450d941a045dd727.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f495450d941a045dd7271283c1eefac206238ca8b5f4cdf2c89ccfcc6e611350
+size 305454
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_200_fcaa96191b1a00069048.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_200_fcaa96191b1a00069048.png
new file mode 100644
index 0000000000000000000000000000000000000000..250695b730fb7285f294d0beb3f1479925227e97
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_200_fcaa96191b1a00069048.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fcaa96191b1a000690484963ac35c17cb76aef3033093dc69f41a3808d40a46e
+size 317834
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_300_0b2dad03f451e5b32137.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_300_0b2dad03f451e5b32137.png
new file mode 100644
index 0000000000000000000000000000000000000000..60dd8b8d761c2ae7c92d404c791d8d6c3b944682
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_300_0b2dad03f451e5b32137.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0b2dad03f451e5b321374f76ff243c18ddf85461e2683894917a1d58b7164262
+size 319125
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_300_57cd6b7c3c28a84604f0.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_300_57cd6b7c3c28a84604f0.png
new file mode 100644
index 0000000000000000000000000000000000000000..469414105fa31af8cc37b05447f36c9c3cee7823
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_300_57cd6b7c3c28a84604f0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:57cd6b7c3c28a84604f04d652246c694d725377b4b3c29bd741459d4d6646ff1
+size 348380
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_300_67dd98ca86f58e8f991f.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_300_67dd98ca86f58e8f991f.png
new file mode 100644
index 0000000000000000000000000000000000000000..5ebff783a19e23cb0baaef4e6a319033bf3f4535
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_300_67dd98ca86f58e8f991f.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:67dd98ca86f58e8f991f703bc249b3c265167505db9fec7ff2130a19db26ec73
+size 311039
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_300_9c6ec3b5d5c6c004719d.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_300_9c6ec3b5d5c6c004719d.png
new file mode 100644
index 0000000000000000000000000000000000000000..92a0497c65e616c040c82dd5281c20dec57f72f3
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_300_9c6ec3b5d5c6c004719d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9c6ec3b5d5c6c004719d924f78b24fb8f9141a0367a9f706a4f2e5e51ee2c354
+size 303314
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_400_5d2f188185d80ef48816.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_400_5d2f188185d80ef48816.png
new file mode 100644
index 0000000000000000000000000000000000000000..022b703a05a93d0d6dfe6f99dca4a07f2dc1dad6
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_400_5d2f188185d80ef48816.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5d2f188185d80ef48816605d781f7656520844d8ec8d681b7fc647ac7eb3a2b9
+size 326517
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_400_80c60e5f11bb725d3b92.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_400_80c60e5f11bb725d3b92.png
new file mode 100644
index 0000000000000000000000000000000000000000..f27a9583fc044233680b96cab88bdc92809a2649
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_400_80c60e5f11bb725d3b92.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:80c60e5f11bb725d3b92b972b5f4890b16ce43aad86fcf6fb7d9df2f6d9ab655
+size 295528
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_400_954df44741af2fe7a2fd.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_400_954df44741af2fe7a2fd.png
new file mode 100644
index 0000000000000000000000000000000000000000..74545c277a5dd2f33c4c38b4ebc589a1a6727c9c
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_400_954df44741af2fe7a2fd.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:954df44741af2fe7a2fd3b2c992dc41d5b6b7e7efc87c820fe2c6c5501a27f11
+size 302306
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_400_9f66440974f233dda178.png b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_400_9f66440974f233dda178.png
new file mode 100644
index 0000000000000000000000000000000000000000..1235f8e5079f6adce83d81df193088cae32cc709
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/media/images/generated_videos_grid_400_9f66440974f233dda178.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9f66440974f233dda17839f69eb368db8ef3ce3774282f1bf3762070e4f05c51
+size 305209
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/requirements.txt b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1d01ecad871b6b3baba9900a3b3d370e9205a61d
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/requirements.txt
@@ -0,0 +1,151 @@
+ImageIO==2.37.2
+typing-inspection==0.4.2
+av==16.0.1
+dill==0.4.0
+matplotlib==3.10.7
+xxhash==3.6.0
+tap==0.2
+mc_bin_client==1.0.1
+exceptiongroup==1.3.1
+cycler==0.12.1
+einops==0.8.1
+opencv-python==4.12.0.88
+scikit-image==0.25.2
+dashscope==1.25.2
+charset-normalizer==3.4.4
+filelock==3.19.1
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+lazy_loader==0.4
+kiwisolver==1.4.9
+Flask==3.1.2
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+setuptools==80.9.0
+websocket-client==1.9.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+itsdangerous==2.2.0
+pydantic_core==2.41.5
+matrix-game-2.0==0.0.1
+wsproto==1.3.2
+psutil==7.1.3
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+nvidia-cusparselt-cu12==0.7.1
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+datasets==4.4.1
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+lightning==2.6.0
+Flask-SocketIO==5.5.1
+torchvision==0.24.1
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+fonttools==4.61.0
+open_clip_torch==3.2.0
+flash_attn==2.8.3
+mdurl==0.1.2
+pandas==2.3.3
+modelscope==1.32.0
+ftfy==6.3.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+beartype==0.22.8
+dominate==2.9.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+bitsandbytes==0.48.2
+lightning-utilities==0.15.2
+easydict==1.13
+networkx==3.3
+wheel==0.45.1
+timm==1.0.22
+pyparsing==3.2.5
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+pfzy==0.3.4
+httpcore==1.0.9
+multidict==6.7.0
+pycparser==2.23
+regex==2025.11.3
+importlib_metadata==8.7.0
+Werkzeug==3.1.4
+antlr4-python3-runtime==4.9.3
+sentry-sdk==2.46.0
+urllib3==2.5.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+cryptography==46.0.3
+omegaconf==2.3.0
+cffi==2.0.0
+packaging==25.0
+inquirerpy==0.3.4
+aiosignal==1.4.0
+MarkupSafe==2.1.5
+nvidia-cuda-nvrtc-cu12==12.8.93
+tzdata==2025.2
+decord==0.6.0
+async-timeout==5.0.1
+sympy==1.14.0
+numpy==2.1.2
+torch==2.9.1
+diffusers==0.35.2
+nvidia-cuda-cupti-cu12==12.8.90
+smmap==5.0.2
+tifffile==2025.5.10
+safetensors==0.7.0
+gitdb==4.0.12
+blinker==1.9.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+typer-slim==0.20.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+python-engineio==4.12.3
+lmdb==1.7.5
+nvidia-nvtx-cu12==12.8.90
+fsspec==2025.9.0
+markdown-it-py==4.0.0
+six==1.17.0
+platformdirs==4.5.0
+starlette==0.50.0
+scipy==1.15.3
+pycocotools==2.0.10
+accelerate==1.12.0
+zipp==3.23.0
+propcache==0.4.1
+bidict==0.23.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+simple-websocket==1.1.0
+nvidia-curand-cu12==10.3.9.90
+contourpy==1.3.2
+imageio-ffmpeg==0.6.0
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+prompt_toolkit==3.0.52
+pillow==11.3.0
+protobuf==6.33.1
+yarl==1.22.0
+clip==1.0
+nvidia-cudnn-cu12==9.10.2.21
+python-socketio==5.15.0
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..8f532fa04cdcc510bc507dd91be7257705ea104f
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/wandb-metadata.json
@@ -0,0 +1,157 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.10.19",
+ "startedAt": "2025-12-29T09:17:49.878310Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/matrix-game2/bin/python3.10",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15576136318976"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "kfxt7o6ujt71x9w49ztafzu7mrji6t6i"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/wandb-summary.json b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..447ffd19e2b55ec31fba41619192284b6bcf0048
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/files/wandb-summary.json
@@ -0,0 +1 @@
+{"generated_videos_first_frame":{"filenames":["media/images/generated_videos_first_frame_400_1d62580a9de1268911e7.png","media/images/generated_videos_first_frame_400_adfbcce8afb45b1c30a8.png","media/images/generated_videos_first_frame_400_35b6b89b5aac853dd909.png","media/images/generated_videos_first_frame_400_0f42b78b94bd3a593399.png"],"captions":["a cat playing","a girl walking","The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.","The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner."],"_type":"images/separated","width":128,"height":128,"format":"png","count":4},"lr":0.0006000000000000001,"_wandb":{"runtime":585},"_runtime":585.580046805,"_timestamp":1.7670003995006068e+09,"_step":500,"step_loss":10.336981773376465,"generated_videos_grid":{"size":326517,"caption":"video_3_grid","_type":"image-file","sha256":"5d2f188185d80ef48816605d781f7656520844d8ec8d681b7fc647ac7eb3a2b9","path":"media/images/generated_videos_grid_400_5d2f188185d80ef48816.png","format":"png","width":522,"height":522},"avg_masking_rate":0.12160906940698624}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_091749-rjwft5vo/run-rjwft5vo.wandb b/Meissonic/wandb/run-20251229_091749-rjwft5vo/run-rjwft5vo.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..ed5283acafeb2ee6ccbdaf43586fc9c026aae814
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_091749-rjwft5vo/run-rjwft5vo.wandb
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1e049ad98ac15273d98a7cca440e40b83f486e7615b939df7172cc406427d4e4
+size 324088
diff --git a/Meissonic/wandb/run-20251229_092338-9aiqswtw/files/config.yaml b/Meissonic/wandb/run-20251229_092338-9aiqswtw/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..e8686544c627b3fd40d62cf6507c84385fef89f1
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_092338-9aiqswtw/files/config.yaml
@@ -0,0 +1,310 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ bljc882in14q19pdqk4un5uzm8o9i1ww:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15584534986752"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/mei-video/bin/python3.13
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.13.11
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T09:23:38.591647Z"
+ writerId: bljc882in14q19pdqk4un5uzm8o9i1ww
+ m: []
+ python_version: 3.13.11
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ - 105
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ - 105
+ "4": 3.13.11
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_092338-9aiqswtw/files/requirements.txt b/Meissonic/wandb/run-20251229_092338-9aiqswtw/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..a15fd034e55caa7ed0e61870cf7b5d86e79963b3
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_092338-9aiqswtw/files/requirements.txt
@@ -0,0 +1,122 @@
+typing-inspection==0.4.2
+dill==0.4.0
+ffmpy==1.0.0
+xxhash==3.6.0
+partd==1.4.2
+brotli==1.2.0
+charset-normalizer==3.4.4
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+orjson==3.11.5
+numpy==2.4.0
+pydantic_core==2.41.5
+groovy==0.1.2
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+MarkupSafe==3.0.3
+protobuf==6.33.2
+nvidia-cusparselt-cu12==0.7.1
+locket==1.0.0
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+pydub==0.25.1
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+torchvision==0.24.1
+cloudpickle==3.1.2
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+open_clip_torch==3.2.0
+mdurl==0.1.2
+pandas==2.3.3
+toolz==1.1.0
+python-multipart==0.0.21
+ftfy==6.3.1
+platformdirs==4.5.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+typer==0.21.0
+lightning-utilities==0.15.2
+gradio_client==2.0.2
+wheel==0.45.1
+timm==1.0.22
+semantic-version==2.10.0
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+importlib_metadata==8.7.1
+httpcore==1.0.9
+fsspec==2025.10.0
+multidict==6.7.0
+regex==2025.11.3
+bitsandbytes==0.49.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+uvicorn==0.40.0
+packaging==25.0
+aiosignal==1.4.0
+nvidia-cuda-nvrtc-cu12==12.8.93
+networkx==3.6.1
+setuptools==80.9.0
+sympy==1.14.0
+torch==2.9.1
+nvidia-cuda-cupti-cu12==12.8.90
+gradio==6.2.0
+smmap==5.0.2
+safetensors==0.7.0
+gitdb==4.0.12
+safehttpx==0.1.7
+fastapi==0.128.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+pillow==12.0.0
+sentry-sdk==2.48.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+tzdata==2025.3
+nvidia-nvtx-cu12==12.8.90
+filelock==3.20.1
+markdown-it-py==4.0.0
+six==1.17.0
+starlette==0.50.0
+audioop-lts==0.2.2
+urllib3==2.6.2
+accelerate==1.12.0
+psutil==7.2.1
+diffusers==0.36.0
+annotated-doc==0.0.4
+zipp==3.23.0
+propcache==0.4.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+nvidia-curand-cu12==10.3.9.90
+datasets==4.4.2
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+aiofiles==24.1.0
+dask==2025.12.0
+yarl==1.22.0
+nvidia-cudnn-cu12==9.10.2.21
+tomlkit==0.13.3
diff --git a/Meissonic/wandb/run-20251229_092338-9aiqswtw/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_092338-9aiqswtw/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..51c58f47591a4ade8344774bcf2461a00a272ae4
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_092338-9aiqswtw/files/wandb-metadata.json
@@ -0,0 +1,157 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.13.11",
+ "startedAt": "2025-12-29T09:23:38.591647Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/mei-video/bin/python3.13",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15584534986752"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "bljc882in14q19pdqk4un5uzm8o9i1ww"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_092338-9aiqswtw/files/wandb-summary.json b/Meissonic/wandb/run-20251229_092338-9aiqswtw/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..75631e6e43bf8471ee6ca35a1ab1286f569677cb
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_092338-9aiqswtw/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_wandb":{"runtime":3},"_runtime":3}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_092338-9aiqswtw/run-9aiqswtw.wandb b/Meissonic/wandb/run-20251229_092338-9aiqswtw/run-9aiqswtw.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..f8704d203fc1aa7f70dc9ef5ab095edf1b75241b
Binary files /dev/null and b/Meissonic/wandb/run-20251229_092338-9aiqswtw/run-9aiqswtw.wandb differ
diff --git a/Meissonic/wandb/run-20251229_092754-hkolswde/files/config.yaml b/Meissonic/wandb/run-20251229_092754-hkolswde/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..b212c1c08c6c9165b0dbc5ed7df9a30e3d52c391
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_092754-hkolswde/files/config.yaml
@@ -0,0 +1,310 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ qek4phtgw47r5nb6ur2jyeakhg2xjsmd:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15584537948160"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/mei-video/bin/python3.13
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.13.11
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T09:27:54.387328Z"
+ writerId: qek4phtgw47r5nb6ur2jyeakhg2xjsmd
+ m: []
+ python_version: 3.13.11
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ - 105
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ - 105
+ "4": 3.13.11
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_092754-hkolswde/files/requirements.txt b/Meissonic/wandb/run-20251229_092754-hkolswde/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..a15fd034e55caa7ed0e61870cf7b5d86e79963b3
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_092754-hkolswde/files/requirements.txt
@@ -0,0 +1,122 @@
+typing-inspection==0.4.2
+dill==0.4.0
+ffmpy==1.0.0
+xxhash==3.6.0
+partd==1.4.2
+brotli==1.2.0
+charset-normalizer==3.4.4
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+orjson==3.11.5
+numpy==2.4.0
+pydantic_core==2.41.5
+groovy==0.1.2
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+MarkupSafe==3.0.3
+protobuf==6.33.2
+nvidia-cusparselt-cu12==0.7.1
+locket==1.0.0
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+pydub==0.25.1
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+torchvision==0.24.1
+cloudpickle==3.1.2
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+open_clip_torch==3.2.0
+mdurl==0.1.2
+pandas==2.3.3
+toolz==1.1.0
+python-multipart==0.0.21
+ftfy==6.3.1
+platformdirs==4.5.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+typer==0.21.0
+lightning-utilities==0.15.2
+gradio_client==2.0.2
+wheel==0.45.1
+timm==1.0.22
+semantic-version==2.10.0
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+importlib_metadata==8.7.1
+httpcore==1.0.9
+fsspec==2025.10.0
+multidict==6.7.0
+regex==2025.11.3
+bitsandbytes==0.49.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+uvicorn==0.40.0
+packaging==25.0
+aiosignal==1.4.0
+nvidia-cuda-nvrtc-cu12==12.8.93
+networkx==3.6.1
+setuptools==80.9.0
+sympy==1.14.0
+torch==2.9.1
+nvidia-cuda-cupti-cu12==12.8.90
+gradio==6.2.0
+smmap==5.0.2
+safetensors==0.7.0
+gitdb==4.0.12
+safehttpx==0.1.7
+fastapi==0.128.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+pillow==12.0.0
+sentry-sdk==2.48.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+tzdata==2025.3
+nvidia-nvtx-cu12==12.8.90
+filelock==3.20.1
+markdown-it-py==4.0.0
+six==1.17.0
+starlette==0.50.0
+audioop-lts==0.2.2
+urllib3==2.6.2
+accelerate==1.12.0
+psutil==7.2.1
+diffusers==0.36.0
+annotated-doc==0.0.4
+zipp==3.23.0
+propcache==0.4.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+nvidia-curand-cu12==10.3.9.90
+datasets==4.4.2
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+aiofiles==24.1.0
+dask==2025.12.0
+yarl==1.22.0
+nvidia-cudnn-cu12==9.10.2.21
+tomlkit==0.13.3
diff --git a/Meissonic/wandb/run-20251229_092754-hkolswde/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_092754-hkolswde/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..5d667483319987e9e2e6bf230de37719e4e17f61
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_092754-hkolswde/files/wandb-metadata.json
@@ -0,0 +1,157 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.13.11",
+ "startedAt": "2025-12-29T09:27:54.387328Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/mei-video/bin/python3.13",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15584537948160"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "qek4phtgw47r5nb6ur2jyeakhg2xjsmd"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_092754-hkolswde/files/wandb-summary.json b/Meissonic/wandb/run-20251229_092754-hkolswde/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..d36db3b5803a02ddbdedb9c4a80ca513af26e4ff
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_092754-hkolswde/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_wandb":{"runtime":45},"_runtime":45}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_092754-hkolswde/run-hkolswde.wandb b/Meissonic/wandb/run-20251229_092754-hkolswde/run-hkolswde.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..ad5a808e80ccf24cd4aaea1c8fe5547d1681622d
Binary files /dev/null and b/Meissonic/wandb/run-20251229_092754-hkolswde/run-hkolswde.wandb differ
diff --git a/Meissonic/wandb/run-20251229_093047-tjwhycdm/files/config.yaml b/Meissonic/wandb/run-20251229_093047-tjwhycdm/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..fbe6a1ec62e0f0bf62480ed7ba6acd244c1f12de
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093047-tjwhycdm/files/config.yaml
@@ -0,0 +1,310 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ 24dmzc7sd9zzpl5g1sdza6vk0vkp4c4k:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15584538075136"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/mei-video/bin/python3.13
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.13.11
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T09:30:47.679442Z"
+ writerId: 24dmzc7sd9zzpl5g1sdza6vk0vkp4c4k
+ m: []
+ python_version: 3.13.11
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ - 105
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ - 105
+ "4": 3.13.11
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_093047-tjwhycdm/files/requirements.txt b/Meissonic/wandb/run-20251229_093047-tjwhycdm/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..a15fd034e55caa7ed0e61870cf7b5d86e79963b3
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093047-tjwhycdm/files/requirements.txt
@@ -0,0 +1,122 @@
+typing-inspection==0.4.2
+dill==0.4.0
+ffmpy==1.0.0
+xxhash==3.6.0
+partd==1.4.2
+brotli==1.2.0
+charset-normalizer==3.4.4
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+orjson==3.11.5
+numpy==2.4.0
+pydantic_core==2.41.5
+groovy==0.1.2
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+MarkupSafe==3.0.3
+protobuf==6.33.2
+nvidia-cusparselt-cu12==0.7.1
+locket==1.0.0
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+pydub==0.25.1
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+torchvision==0.24.1
+cloudpickle==3.1.2
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+open_clip_torch==3.2.0
+mdurl==0.1.2
+pandas==2.3.3
+toolz==1.1.0
+python-multipart==0.0.21
+ftfy==6.3.1
+platformdirs==4.5.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+typer==0.21.0
+lightning-utilities==0.15.2
+gradio_client==2.0.2
+wheel==0.45.1
+timm==1.0.22
+semantic-version==2.10.0
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+importlib_metadata==8.7.1
+httpcore==1.0.9
+fsspec==2025.10.0
+multidict==6.7.0
+regex==2025.11.3
+bitsandbytes==0.49.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+uvicorn==0.40.0
+packaging==25.0
+aiosignal==1.4.0
+nvidia-cuda-nvrtc-cu12==12.8.93
+networkx==3.6.1
+setuptools==80.9.0
+sympy==1.14.0
+torch==2.9.1
+nvidia-cuda-cupti-cu12==12.8.90
+gradio==6.2.0
+smmap==5.0.2
+safetensors==0.7.0
+gitdb==4.0.12
+safehttpx==0.1.7
+fastapi==0.128.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+pillow==12.0.0
+sentry-sdk==2.48.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+tzdata==2025.3
+nvidia-nvtx-cu12==12.8.90
+filelock==3.20.1
+markdown-it-py==4.0.0
+six==1.17.0
+starlette==0.50.0
+audioop-lts==0.2.2
+urllib3==2.6.2
+accelerate==1.12.0
+psutil==7.2.1
+diffusers==0.36.0
+annotated-doc==0.0.4
+zipp==3.23.0
+propcache==0.4.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+nvidia-curand-cu12==10.3.9.90
+datasets==4.4.2
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+aiofiles==24.1.0
+dask==2025.12.0
+yarl==1.22.0
+nvidia-cudnn-cu12==9.10.2.21
+tomlkit==0.13.3
diff --git a/Meissonic/wandb/run-20251229_093047-tjwhycdm/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_093047-tjwhycdm/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..84d580b153c5079e7952b13d957d329c1ef17f55
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093047-tjwhycdm/files/wandb-metadata.json
@@ -0,0 +1,157 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.13.11",
+ "startedAt": "2025-12-29T09:30:47.679442Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/mei-video/bin/python3.13",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15584538075136"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "24dmzc7sd9zzpl5g1sdza6vk0vkp4c4k"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_093047-tjwhycdm/files/wandb-summary.json b/Meissonic/wandb/run-20251229_093047-tjwhycdm/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..c06148192ebeffb55e0aca087559968faedf510f
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093047-tjwhycdm/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_wandb":{"runtime":78},"_runtime":78}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_093047-tjwhycdm/run-tjwhycdm.wandb b/Meissonic/wandb/run-20251229_093047-tjwhycdm/run-tjwhycdm.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..df97e0be126c82e3f76fe16d3515820c56e392c3
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093047-tjwhycdm/run-tjwhycdm.wandb differ
diff --git a/Meissonic/wandb/run-20251229_093332-4lgcq9jf/files/config.yaml b/Meissonic/wandb/run-20251229_093332-4lgcq9jf/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..219de3c10a01c14d9cb72e9e2ec34de106c60323
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093332-4lgcq9jf/files/config.yaml
@@ -0,0 +1,310 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ a7q3mqc5fpt8eof7ipn9yioxo8tydpnv:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15584538189824"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/mei-video/bin/python3.13
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.13.11
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T09:33:32.970685Z"
+ writerId: a7q3mqc5fpt8eof7ipn9yioxo8tydpnv
+ m: []
+ python_version: 3.13.11
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ - 105
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ - 105
+ "4": 3.13.11
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_093332-4lgcq9jf/files/requirements.txt b/Meissonic/wandb/run-20251229_093332-4lgcq9jf/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..a15fd034e55caa7ed0e61870cf7b5d86e79963b3
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093332-4lgcq9jf/files/requirements.txt
@@ -0,0 +1,122 @@
+typing-inspection==0.4.2
+dill==0.4.0
+ffmpy==1.0.0
+xxhash==3.6.0
+partd==1.4.2
+brotli==1.2.0
+charset-normalizer==3.4.4
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+orjson==3.11.5
+numpy==2.4.0
+pydantic_core==2.41.5
+groovy==0.1.2
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+MarkupSafe==3.0.3
+protobuf==6.33.2
+nvidia-cusparselt-cu12==0.7.1
+locket==1.0.0
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+pydub==0.25.1
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+torchvision==0.24.1
+cloudpickle==3.1.2
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+open_clip_torch==3.2.0
+mdurl==0.1.2
+pandas==2.3.3
+toolz==1.1.0
+python-multipart==0.0.21
+ftfy==6.3.1
+platformdirs==4.5.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+typer==0.21.0
+lightning-utilities==0.15.2
+gradio_client==2.0.2
+wheel==0.45.1
+timm==1.0.22
+semantic-version==2.10.0
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+importlib_metadata==8.7.1
+httpcore==1.0.9
+fsspec==2025.10.0
+multidict==6.7.0
+regex==2025.11.3
+bitsandbytes==0.49.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+uvicorn==0.40.0
+packaging==25.0
+aiosignal==1.4.0
+nvidia-cuda-nvrtc-cu12==12.8.93
+networkx==3.6.1
+setuptools==80.9.0
+sympy==1.14.0
+torch==2.9.1
+nvidia-cuda-cupti-cu12==12.8.90
+gradio==6.2.0
+smmap==5.0.2
+safetensors==0.7.0
+gitdb==4.0.12
+safehttpx==0.1.7
+fastapi==0.128.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+pillow==12.0.0
+sentry-sdk==2.48.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+tzdata==2025.3
+nvidia-nvtx-cu12==12.8.90
+filelock==3.20.1
+markdown-it-py==4.0.0
+six==1.17.0
+starlette==0.50.0
+audioop-lts==0.2.2
+urllib3==2.6.2
+accelerate==1.12.0
+psutil==7.2.1
+diffusers==0.36.0
+annotated-doc==0.0.4
+zipp==3.23.0
+propcache==0.4.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+nvidia-curand-cu12==10.3.9.90
+datasets==4.4.2
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+aiofiles==24.1.0
+dask==2025.12.0
+yarl==1.22.0
+nvidia-cudnn-cu12==9.10.2.21
+tomlkit==0.13.3
diff --git a/Meissonic/wandb/run-20251229_093332-4lgcq9jf/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_093332-4lgcq9jf/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..af975273a156cf2c906018605c429aa2f436943c
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093332-4lgcq9jf/files/wandb-metadata.json
@@ -0,0 +1,157 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.13.11",
+ "startedAt": "2025-12-29T09:33:32.970685Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/mei-video/bin/python3.13",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15584538189824"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "a7q3mqc5fpt8eof7ipn9yioxo8tydpnv"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_093332-4lgcq9jf/files/wandb-summary.json b/Meissonic/wandb/run-20251229_093332-4lgcq9jf/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..36b39aeda94abcc7cd91a61f0f9fbfff2e31c16a
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093332-4lgcq9jf/files/wandb-summary.json
@@ -0,0 +1 @@
+{"_runtime":71,"_wandb":{"runtime":71}}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_093332-4lgcq9jf/run-4lgcq9jf.wandb b/Meissonic/wandb/run-20251229_093332-4lgcq9jf/run-4lgcq9jf.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..79025c0dfede9b817058d6e74f930619a56afaea
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093332-4lgcq9jf/run-4lgcq9jf.wandb differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/config.yaml b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..6bf790f39da16dd1a94c1e9b59e7ed0464b1f2fb
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/config.yaml
@@ -0,0 +1,312 @@
+_wandb:
+ value:
+ cli_version: 0.23.1
+ e:
+ vcdq559fpl4a8h6xzf7kxm6bwt39s12v:
+ args:
+ - --use_precomputed_video_only
+ - --features_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+ - --text_encoder_architecture
+ - umt5-xxl
+ - --wan_pretrained_path
+ - /mnt/Wan2.1-T2V-1.3B
+ - --training_from_scratch
+ - --pretrained_model_name_or_path
+ - dummy
+ - --wan_backbone_lr_ratio
+ - "0.2"
+ - --num_frames
+ - "17"
+ - --video_height
+ - "128"
+ - --video_width
+ - "128"
+ - --dataloader_num_workers
+ - "8"
+ - --video_tokenizer_model_id
+ - Cosmos-0.1-Tokenizer-DV4x8x8
+ - --instance_dataset
+ - OpenVid1MDataset
+ - --instance_data_dir
+ - /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+ - --train_batch_size
+ - "1"
+ - --gradient_accumulation_steps
+ - "1"
+ - --learning_rate
+ - "3e-3"
+ - --max_train_steps
+ - "100000"
+ - --checkpointing_steps
+ - "500"
+ - --validation_steps
+ - "100"
+ - --logging_steps
+ - "10"
+ - --validation_prompts
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+ - --output_dir
+ - ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+ - --mixed_precision
+ - bf16
+ - --lr_scheduler
+ - constant
+ - --lr_warmup_steps
+ - "0"
+ - --use_8bit_adam
+ - --gradient_checkpointing
+ - --min_masking_rate
+ - "0.0"
+ - --cond_dropout_prob
+ - "0.0"
+ - --split_vae_encode
+ - "1"
+ - --allow_tf32
+ - --seed
+ - "42"
+ - --report_to
+ - wandb
+ codePath: train/train_mei_video.py
+ codePathLocal: train/train_mei_video.py
+ cpu_count: 48
+ cpu_count_logical: 96
+ cudaVersion: "12.8"
+ disk:
+ /:
+ total: "16650112278528"
+ used: "15584538329088"
+ email: jinbin5bai@gmail.com
+ executable: /home/ubuntu/miniconda3/envs/mei-video/bin/python3.13
+ git:
+ commit: 6819d374ef1b86bdedad373aab1121a89687e5cf
+ remote: https://github.com/viiika/Meissonic.git
+ gpu: NVIDIA A100-SXM4-40GB
+ gpu_count: 8
+ gpu_nvidia:
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-71102f28-cd17-57e7-6181-120bf743d23d
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-303ab142-3206-9a14-c758-58ab97d7510e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e
+ - architecture: Ampere
+ cudaCores: 6912
+ memoryTotal: "42949672960"
+ name: NVIDIA A100-SXM4-40GB
+ uuid: GPU-efb2d1fc-1eed-653d-ed51-5273085154ba
+ host: ip-172-31-91-136
+ memory:
+ total: "1204521451520"
+ os: Linux-6.8.0-1027-aws-x86_64-with-glibc2.35
+ program: /mnt/Meissonic/train/train_mei_video.py
+ python: CPython 3.13.11
+ root: /mnt/Meissonic
+ startedAt: "2025-12-29T09:35:00.408966Z"
+ writerId: vcdq559fpl4a8h6xzf7kxm6bwt39s12v
+ m: []
+ python_version: 3.13.11
+ t:
+ "1":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ - 105
+ "2":
+ - 1
+ - 11
+ - 41
+ - 49
+ - 51
+ - 71
+ - 83
+ - 98
+ - 105
+ "3":
+ - 61
+ "4": 3.13.11
+ "5": 0.23.1
+ "6": 4.57.3
+ "12": 0.23.1
+ "13": linux-x86_64
+adam_beta1:
+ value: 0.9
+adam_beta2:
+ value: 0.999
+adam_epsilon:
+ value: 1e-08
+adam_weight_decay:
+ value: 0.01
+allow_tf32:
+ value: true
+checkpointing_steps:
+ value: 500
+checkpoints_total_limit:
+ value: null
+cond_dropout_prob:
+ value: 0
+dataloader_num_workers:
+ value: 8
+dataloader_prefetch_factor:
+ value: 2
+ema_decay:
+ value: 0.9999
+ema_update_after_step:
+ value: 0
+empty_embeds_path:
+ value: null
+features_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128
+freeze_wan_backbone:
+ value: false
+gradient_accumulation_steps:
+ value: 1
+gradient_checkpointing:
+ value: true
+image_key:
+ value: null
+instance_data_dataset:
+ value: null
+instance_data_dir:
+ value: /mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv
+instance_data_image:
+ value: null
+instance_dataset:
+ value: OpenVid1MDataset
+learning_rate:
+ value: 0.003
+logging_dir:
+ value: logs
+logging_steps:
+ value: 10
+lora_alpha:
+ value: 32
+lora_r:
+ value: 16
+lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+lr_scheduler:
+ value: constant
+lr_warmup_steps:
+ value: 0
+max_grad_norm:
+ value: 50
+max_train_steps:
+ value: 100000
+min_masking_rate:
+ value: 0
+mixed_precision:
+ value: bf16
+num_frames:
+ value: 17
+output_dir:
+ value: ./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3
+pretrained_model_name_or_path:
+ value: dummy
+prompt_key:
+ value: null
+prompt_prefix:
+ value: null
+report_to:
+ value: wandb
+resolution:
+ value: 512
+resume_from_checkpoint:
+ value: null
+revision:
+ value: null
+scale_lr:
+ value: false
+seed:
+ value: 42
+split_vae_encode:
+ value: 1
+text_encoder_architecture:
+ value: umt5-xxl
+text_encoder_lora_alpha:
+ value: 32
+text_encoder_lora_r:
+ value: 16
+text_encoder_lora_target_modules:
+ value:
+ - to_q
+ - to_k
+ - to_v
+text_encoder_use_lora:
+ value: false
+train_batch_size:
+ value: 1
+train_text_encoder:
+ value: false
+training_from_scratch:
+ value: true
+use_8bit_adam:
+ value: true
+use_ema:
+ value: false
+use_lora:
+ value: false
+use_precomputed_features:
+ value: false
+use_precomputed_video_only:
+ value: true
+validation_prompts:
+ value:
+ - a cat playing
+ - a girl walking
+ - The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.
+ - The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.
+validation_steps:
+ value: 100
+variant:
+ value: null
+video_height:
+ value: 128
+video_tokenizer_model_id:
+ value: Cosmos-0.1-Tokenizer-DV4x8x8
+video_width:
+ value: 128
+wan_backbone_lr_ratio:
+ value: 0.2
+wan_pretrained_path:
+ value: /mnt/Wan2.1-T2V-1.3B
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_100_0ef22762a3ab27fb03dc.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_100_0ef22762a3ab27fb03dc.png
new file mode 100644
index 0000000000000000000000000000000000000000..371ff5912a51e128fbeceffd1f3f3eaa832c6a9b
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_100_0ef22762a3ab27fb03dc.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_100_308ba870124404b0fb9b.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_100_308ba870124404b0fb9b.png
new file mode 100644
index 0000000000000000000000000000000000000000..a6b9edaead67a3099ca22dfbe07b93d4440d0807
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_100_308ba870124404b0fb9b.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_100_ad7ff2599e88cf9a7681.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_100_ad7ff2599e88cf9a7681.png
new file mode 100644
index 0000000000000000000000000000000000000000..bd6255eb481309a861cb49e49327602bec58e745
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_100_ad7ff2599e88cf9a7681.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_100_d538d6898e00f9eedeb3.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_100_d538d6898e00f9eedeb3.png
new file mode 100644
index 0000000000000000000000000000000000000000..3727f8d04b582efc0a6d9836461e788e50ee16ee
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_100_d538d6898e00f9eedeb3.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_200_19744c7a8da530e62cea.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_200_19744c7a8da530e62cea.png
new file mode 100644
index 0000000000000000000000000000000000000000..b438bb7b5f3cc61673e1546790bdef36c4b290e1
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_200_19744c7a8da530e62cea.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_200_265c8c33d9cde256e090.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_200_265c8c33d9cde256e090.png
new file mode 100644
index 0000000000000000000000000000000000000000..c2d785d5b1a3baebe3f77347079c9ac0c3028113
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_200_265c8c33d9cde256e090.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_200_90b9bcd50f8469d941b1.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_200_90b9bcd50f8469d941b1.png
new file mode 100644
index 0000000000000000000000000000000000000000..f4b68ba4987c632c082c64509a745128345e8f2b
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_200_90b9bcd50f8469d941b1.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_200_c91d5c3f300da2b3edcf.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_200_c91d5c3f300da2b3edcf.png
new file mode 100644
index 0000000000000000000000000000000000000000..0a253953cfbd5bd4aab4eb23f9ecd8e2ae973a8a
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_200_c91d5c3f300da2b3edcf.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_300_3b54399d1898c15ae654.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_300_3b54399d1898c15ae654.png
new file mode 100644
index 0000000000000000000000000000000000000000..d8e5517ca4eb1daae1425667cc2815a7a3adf854
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_300_3b54399d1898c15ae654.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_300_73f84f623bd451cafbb4.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_300_73f84f623bd451cafbb4.png
new file mode 100644
index 0000000000000000000000000000000000000000..ef3da8aa10f287967cb80072fc6b6b72732b3b9c
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_300_73f84f623bd451cafbb4.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_300_c18353e13948a8684c94.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_300_c18353e13948a8684c94.png
new file mode 100644
index 0000000000000000000000000000000000000000..412289ba34c097d7c698311ba8d4be091e3671f9
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_300_c18353e13948a8684c94.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_300_e3413b9cbd48a23f9e85.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_300_e3413b9cbd48a23f9e85.png
new file mode 100644
index 0000000000000000000000000000000000000000..62274a0afb9219a3463fb84b7f2dca0bc8ac3e78
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_300_e3413b9cbd48a23f9e85.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_400_5dcdc12cd2058ba5a9ee.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_400_5dcdc12cd2058ba5a9ee.png
new file mode 100644
index 0000000000000000000000000000000000000000..41d30da4db7eed110e0dff60ad9535eaa9700158
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_400_5dcdc12cd2058ba5a9ee.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_400_7e274e88e8caf65dc595.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_400_7e274e88e8caf65dc595.png
new file mode 100644
index 0000000000000000000000000000000000000000..659385e376f87d3a7ddc3188b584f05d1659cd49
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_400_7e274e88e8caf65dc595.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_400_a14b1826b3dd5a9eba69.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_400_a14b1826b3dd5a9eba69.png
new file mode 100644
index 0000000000000000000000000000000000000000..07caf99cc0807206280020348fd443407a79e8a8
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_400_a14b1826b3dd5a9eba69.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_400_daf2d6d4a92c1732b441.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_400_daf2d6d4a92c1732b441.png
new file mode 100644
index 0000000000000000000000000000000000000000..e008954c6c93054395fee288aa001cecb04fe5e6
Binary files /dev/null and b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_first_frame_400_daf2d6d4a92c1732b441.png differ
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_100_2420e35dd1e83c0660b9.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_100_2420e35dd1e83c0660b9.png
new file mode 100644
index 0000000000000000000000000000000000000000..3e9423148c760d808327a1f0a2ae12522d4afdf8
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_100_2420e35dd1e83c0660b9.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2420e35dd1e83c0660b997b506db4a7f64f409b038a5bba55bf6861a554efdc7
+size 341167
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_100_2b1ecd238e72a9b523a2.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_100_2b1ecd238e72a9b523a2.png
new file mode 100644
index 0000000000000000000000000000000000000000..22c0b761b943a9fa9bc650dcfc6e6ae7d1eb19a8
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_100_2b1ecd238e72a9b523a2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2b1ecd238e72a9b523a272f0b1cc1c9b1b26547023efff5a76fe4c7c50aa5ae6
+size 300392
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_100_364542bf705de160ac0e.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_100_364542bf705de160ac0e.png
new file mode 100644
index 0000000000000000000000000000000000000000..23ab87cb477ac3204de0f7efa84f6ccd3de53f45
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_100_364542bf705de160ac0e.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:364542bf705de160ac0ee79bc5203066606da6ba6075613fe7a660864e4daa2a
+size 337437
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_100_856cc2d108cc40e00579.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_100_856cc2d108cc40e00579.png
new file mode 100644
index 0000000000000000000000000000000000000000..e8d3a9c808a5aea5538df48ab1d8ef7040640e51
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_100_856cc2d108cc40e00579.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:856cc2d108cc40e005793bae9c1c72d9aa0d9bfae67d61f91ad076e605b69355
+size 307269
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_200_4d12cf67fd99430d1276.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_200_4d12cf67fd99430d1276.png
new file mode 100644
index 0000000000000000000000000000000000000000..21e1acdb2b88edb29c2167c7f4cbe5ad369f47d4
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_200_4d12cf67fd99430d1276.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4d12cf67fd99430d127643000bff8510c66549270ca767501eb78a9e7b687d2f
+size 422606
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_200_b63354772314936f3223.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_200_b63354772314936f3223.png
new file mode 100644
index 0000000000000000000000000000000000000000..8eabe640522e6c9472882c8c44f85ccbdd58ec91
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_200_b63354772314936f3223.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b63354772314936f322363f8c7a04f02a7b56f1cc4284bf3bf763ac2f3e7c866
+size 433378
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_200_b864896c48f53b3f28b2.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_200_b864896c48f53b3f28b2.png
new file mode 100644
index 0000000000000000000000000000000000000000..e4625b01cdf11daee93c4ec6c3f2167aded7ed81
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_200_b864896c48f53b3f28b2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b864896c48f53b3f28b274effeeb4d0c1fc73d65ea8321281529756f5b577cb1
+size 406799
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_200_ed8fa5609e3346e45657.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_200_ed8fa5609e3346e45657.png
new file mode 100644
index 0000000000000000000000000000000000000000..060ec5c7219476d37b71dbe7e32270d2e38fc927
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_200_ed8fa5609e3346e45657.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ed8fa5609e3346e456574dfa7dd4cca969cf11372b5e670cf22783f9f9368fe4
+size 450980
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_300_05e3c676f628aa578303.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_300_05e3c676f628aa578303.png
new file mode 100644
index 0000000000000000000000000000000000000000..c758567d501b232d74430157c8bec7b714c2b3fb
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_300_05e3c676f628aa578303.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:05e3c676f628aa57830344f496e7e82c5cb498cace621e57f0c518105e5a20c4
+size 341846
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_300_24d3f6d3dd6f2562c8c3.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_300_24d3f6d3dd6f2562c8c3.png
new file mode 100644
index 0000000000000000000000000000000000000000..cc89ebed3336a2da0b3b0a7915549042cc2be341
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_300_24d3f6d3dd6f2562c8c3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:24d3f6d3dd6f2562c8c3fc52e57f26ff6aef0ce97d9756fc6c64906e09e7106f
+size 367123
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_300_8cc2ceb3c3ab491db83c.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_300_8cc2ceb3c3ab491db83c.png
new file mode 100644
index 0000000000000000000000000000000000000000..0bbde0ff5a76aa5174cd22bccc7df80151dc0d6f
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_300_8cc2ceb3c3ab491db83c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8cc2ceb3c3ab491db83cfc97d106cab180380ebecee19d5151d1ab70eb68ef15
+size 336819
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_300_df871bf19088b3f6714c.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_300_df871bf19088b3f6714c.png
new file mode 100644
index 0000000000000000000000000000000000000000..d5a2bb0c7093aa5bc7f8f16be76416f10926c532
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_300_df871bf19088b3f6714c.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:df871bf19088b3f6714cc29678cfccadc184579b829850ab12a691e936293017
+size 394279
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_400_053d1da246a9155b89d3.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_400_053d1da246a9155b89d3.png
new file mode 100644
index 0000000000000000000000000000000000000000..764cf7ad349a524dbd9d917be4b74fd11e05a8a0
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_400_053d1da246a9155b89d3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:053d1da246a9155b89d3117d33ddf47552bd72d464f742afe62759ab3d43d1e3
+size 419777
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_400_3a14e8f83afe8766827a.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_400_3a14e8f83afe8766827a.png
new file mode 100644
index 0000000000000000000000000000000000000000..b4188b98c4e1f1f6e7b48c78ee6f891e84fa7f19
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_400_3a14e8f83afe8766827a.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3a14e8f83afe8766827a3f23218980a95de0d87ce5109228aed6e3acf5ed3d85
+size 434821
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_400_49f5c1d1b56568ada77d.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_400_49f5c1d1b56568ada77d.png
new file mode 100644
index 0000000000000000000000000000000000000000..ceb94c122a2f201c370653a8a007f69da6f1a34c
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_400_49f5c1d1b56568ada77d.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:49f5c1d1b56568ada77df5cf5e86dcc1d939bacd059d3902acd2006b08c12446
+size 434327
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_400_976d3a1db399a03059c3.png b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_400_976d3a1db399a03059c3.png
new file mode 100644
index 0000000000000000000000000000000000000000..1c7d6ea8a3cf238ed6544fddfccee25b8c709356
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/media/images/generated_videos_grid_400_976d3a1db399a03059c3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:976d3a1db399a03059c33ef0ff905764962941de580c5ee03df3dceb51f3148b
+size 431591
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/requirements.txt b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..a15fd034e55caa7ed0e61870cf7b5d86e79963b3
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/requirements.txt
@@ -0,0 +1,122 @@
+typing-inspection==0.4.2
+dill==0.4.0
+ffmpy==1.0.0
+xxhash==3.6.0
+partd==1.4.2
+brotli==1.2.0
+charset-normalizer==3.4.4
+tokenizers==0.22.1
+aiohappyeyeballs==2.6.1
+python-dateutil==2.9.0.post0
+pyarrow==22.0.0
+annotated-types==0.7.0
+GitPython==3.1.45
+rich==14.2.0
+nvidia-cufile-cu12==1.13.1.3
+nvidia-nvshmem-cu12==3.3.20
+orjson==3.11.5
+numpy==2.4.0
+pydantic_core==2.41.5
+groovy==0.1.2
+peft==0.18.0
+typing_extensions==4.15.0
+wcwidth==0.2.14
+MarkupSafe==3.0.3
+protobuf==6.33.2
+nvidia-cusparselt-cu12==0.7.1
+locket==1.0.0
+PyYAML==6.0.3
+nvidia-nvjitlink-cu12==12.8.93
+pytorch-lightning==2.6.0
+frozenlist==1.8.0
+pydub==0.25.1
+huggingface-hub==0.36.0
+Pygments==2.19.2
+aiohttp==3.13.2
+torchvision==0.24.1
+cloudpickle==3.1.2
+wandb==0.23.1
+tqdm==4.67.1
+httpx==0.28.1
+open_clip_torch==3.2.0
+mdurl==0.1.2
+pandas==2.3.3
+toolz==1.1.0
+python-multipart==0.0.21
+ftfy==6.3.1
+platformdirs==4.5.1
+transformers==4.57.3
+requests==2.32.5
+pytz==2025.2
+Jinja2==3.1.6
+click==8.3.1
+attrs==25.4.0
+hf-xet==1.2.0
+shellingham==1.5.4
+nvidia-nccl-cu12==2.27.5
+nvidia-cuda-runtime-cu12==12.8.90
+typer==0.21.0
+lightning-utilities==0.15.2
+gradio_client==2.0.2
+wheel==0.45.1
+timm==1.0.22
+semantic-version==2.10.0
+triton==3.5.1
+nvidia-cublas-cu12==12.8.4.1
+importlib_metadata==8.7.1
+httpcore==1.0.9
+fsspec==2025.10.0
+multidict==6.7.0
+regex==2025.11.3
+bitsandbytes==0.49.0
+anyio==4.12.0
+nvidia-cusolver-cu12==11.7.3.90
+torchmetrics==1.8.2
+uvicorn==0.40.0
+packaging==25.0
+aiosignal==1.4.0
+nvidia-cuda-nvrtc-cu12==12.8.93
+networkx==3.6.1
+setuptools==80.9.0
+sympy==1.14.0
+torch==2.9.1
+nvidia-cuda-cupti-cu12==12.8.90
+gradio==6.2.0
+smmap==5.0.2
+safetensors==0.7.0
+gitdb==4.0.12
+safehttpx==0.1.7
+fastapi==0.128.0
+nvidia-cusparse-cu12==12.5.8.93
+multiprocess==0.70.18
+pillow==12.0.0
+sentry-sdk==2.48.0
+h11==0.16.0
+certifi==2025.11.12
+idna==3.11
+tzdata==2025.3
+nvidia-nvtx-cu12==12.8.90
+filelock==3.20.1
+markdown-it-py==4.0.0
+six==1.17.0
+starlette==0.50.0
+audioop-lts==0.2.2
+urllib3==2.6.2
+accelerate==1.12.0
+psutil==7.2.1
+diffusers==0.36.0
+annotated-doc==0.0.4
+zipp==3.23.0
+propcache==0.4.1
+mpmath==1.3.0
+sentencepiece==0.2.1
+nvidia-curand-cu12==10.3.9.90
+datasets==4.4.2
+nvidia-cufft-cu12==11.3.3.83
+pydantic==2.12.5
+pip==25.3
+aiofiles==24.1.0
+dask==2025.12.0
+yarl==1.22.0
+nvidia-cudnn-cu12==9.10.2.21
+tomlkit==0.13.3
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/wandb-metadata.json b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/wandb-metadata.json
new file mode 100644
index 0000000000000000000000000000000000000000..bdbf044dc4ee8f282dd2c84fb8cc5fb73684acd3
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/wandb-metadata.json
@@ -0,0 +1,157 @@
+{
+ "os": "Linux-6.8.0-1027-aws-x86_64-with-glibc2.35",
+ "python": "CPython 3.13.11",
+ "startedAt": "2025-12-29T09:35:00.408966Z",
+ "args": [
+ "--use_precomputed_video_only",
+ "--features_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/extracted_features_17_128_128",
+ "--text_encoder_architecture",
+ "umt5-xxl",
+ "--wan_pretrained_path",
+ "/mnt/Wan2.1-T2V-1.3B",
+ "--training_from_scratch",
+ "--pretrained_model_name_or_path",
+ "dummy",
+ "--wan_backbone_lr_ratio",
+ "0.2",
+ "--num_frames",
+ "17",
+ "--video_height",
+ "128",
+ "--video_width",
+ "128",
+ "--dataloader_num_workers",
+ "8",
+ "--video_tokenizer_model_id",
+ "Cosmos-0.1-Tokenizer-DV4x8x8",
+ "--instance_dataset",
+ "OpenVid1MDataset",
+ "--instance_data_dir",
+ "/mnt/VideoGen/dataset/OpenVid1M/video_reorg/OpenVid1M_reorganized.csv",
+ "--train_batch_size",
+ "1",
+ "--gradient_accumulation_steps",
+ "1",
+ "--learning_rate",
+ "3e-3",
+ "--max_train_steps",
+ "100000",
+ "--checkpointing_steps",
+ "500",
+ "--validation_steps",
+ "100",
+ "--logging_steps",
+ "10",
+ "--validation_prompts",
+ "a cat playing",
+ "a girl walking",
+ "The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.",
+ "The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner.",
+ "--output_dir",
+ "./output_256x256_17f_2*4bs_4*8*8vqvae_0_2_ratio_lr3e-3",
+ "--mixed_precision",
+ "bf16",
+ "--lr_scheduler",
+ "constant",
+ "--lr_warmup_steps",
+ "0",
+ "--use_8bit_adam",
+ "--gradient_checkpointing",
+ "--min_masking_rate",
+ "0.0",
+ "--cond_dropout_prob",
+ "0.0",
+ "--split_vae_encode",
+ "1",
+ "--allow_tf32",
+ "--seed",
+ "42",
+ "--report_to",
+ "wandb"
+ ],
+ "program": "/mnt/Meissonic/train/train_mei_video.py",
+ "codePath": "train/train_mei_video.py",
+ "codePathLocal": "train/train_mei_video.py",
+ "git": {
+ "remote": "https://github.com/viiika/Meissonic.git",
+ "commit": "6819d374ef1b86bdedad373aab1121a89687e5cf"
+ },
+ "email": "jinbin5bai@gmail.com",
+ "root": "/mnt/Meissonic",
+ "host": "ip-172-31-91-136",
+ "executable": "/home/ubuntu/miniconda3/envs/mei-video/bin/python3.13",
+ "cpu_count": 48,
+ "cpu_count_logical": 96,
+ "gpu": "NVIDIA A100-SXM4-40GB",
+ "gpu_count": 8,
+ "disk": {
+ "/": {
+ "total": "16650112278528",
+ "used": "15584538329088"
+ }
+ },
+ "memory": {
+ "total": "1204521451520"
+ },
+ "gpu_nvidia": [
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-54a50f05-7a41-8b8e-59c5-e1774ec42215"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-71102f28-cd17-57e7-6181-120bf743d23d"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-57dfac44-bb50-f9b6-1534-27fbe79dfd87"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-f55652c0-bdaf-e7bb-a876-8fce14c3f879"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-303ab142-3206-9a14-c758-58ab97d7510e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-1adf5c34-24d0-c5e2-b33b-783100bbd6c3"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-5b4a0e50-96a5-74bd-f595-14de5614cc6e"
+ },
+ {
+ "name": "NVIDIA A100-SXM4-40GB",
+ "memoryTotal": "42949672960",
+ "cudaCores": 6912,
+ "architecture": "Ampere",
+ "uuid": "GPU-efb2d1fc-1eed-653d-ed51-5273085154ba"
+ }
+ ],
+ "cudaVersion": "12.8",
+ "writerId": "vcdq559fpl4a8h6xzf7kxm6bwt39s12v"
+}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/wandb-summary.json b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/wandb-summary.json
new file mode 100644
index 0000000000000000000000000000000000000000..9dcca013a028c9c21d84e08e974a08f15e51ae43
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/files/wandb-summary.json
@@ -0,0 +1 @@
+{"generated_videos_first_frame":{"count":4,"filenames":["media/images/generated_videos_first_frame_400_daf2d6d4a92c1732b441.png","media/images/generated_videos_first_frame_400_7e274e88e8caf65dc595.png","media/images/generated_videos_first_frame_400_5dcdc12cd2058ba5a9ee.png","media/images/generated_videos_first_frame_400_a14b1826b3dd5a9eba69.png"],"captions":["a cat playing","a girl walking","The video features a man named David Schultz from Hamline University. He is dressed in a suit and tie, standing in front of a building with a tree in the background. The man appears to be speaking or presenting, as suggested by the context of the image. The style of the video is likely informative or educational, given the context of the man's attire and the setting. The video may be part of a news segment or a lecture series, as indicated by the man's professional appearance and the presence of a building that could be a university or academic institution.","The video captures the interior of a car at a car show. The car features a striking orange and black color scheme, with the seats upholstered in orange leather and the door panels in black leather. The car's interior is well-lit, highlighting the details of the upholstery and the design of the door panels. The car is on display, with people walking around and observing it. The car show setting is bustling with activity, with other cars and people visible in the background. The video is a close-up shot of the car's interior, focusing on the details of the upholstery and the design of the door panels. The style of the video is realistic, capturing the car's interior in a clear and detailed manner."],"_type":"images/separated","width":128,"height":128,"format":"png"},"generated_videos_grid":{"width":522,"height":522,"caption":"video_3_grid","sha256":"053d1da246a9155b89d3117d33ddf47552bd72d464f742afe62759ab3d43d1e3","format":"png","_type":"image-file","size":419777,"path":"media/images/generated_videos_grid_400_053d1da246a9155b89d3.png"},"_timestamp":1.767001316258984e+09,"step_loss":11.12689208984375,"avg_masking_rate":0.9541797637939453,"_wandb":{"runtime":421},"_step":420,"lr":0.0006000000000000001,"_runtime":421.494331581}
\ No newline at end of file
diff --git a/Meissonic/wandb/run-20251229_093500-yyrdgepk/run-yyrdgepk.wandb b/Meissonic/wandb/run-20251229_093500-yyrdgepk/run-yyrdgepk.wandb
new file mode 100644
index 0000000000000000000000000000000000000000..e04445ee3dd256075d5ca6ca9916bc6d4cd4e5e2
--- /dev/null
+++ b/Meissonic/wandb/run-20251229_093500-yyrdgepk/run-yyrdgepk.wandb
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:71ac9f3708d9bf33df07f5adffcc7a734cb4125d331d2cc0cbeda23bdf521b7d
+size 259129
diff --git a/OpenVid1M_reorganized.csv b/OpenVid1M_reorganized.csv
new file mode 100644
index 0000000000000000000000000000000000000000..adbe7b345ff587bc33654c1588e6650890e78942
--- /dev/null
+++ b/OpenVid1M_reorganized.csv
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b5eb9daff91e74f994f9a271daff6edb825bcfb60e29977334df4421d5fd560c
+size 854237904
diff --git a/Wan2.1-T2V-1.3B/.gitattributes b/Wan2.1-T2V-1.3B/.gitattributes
new file mode 100644
index 0000000000000000000000000000000000000000..0a1f66aea84e17cbf6a60c51431723062f87df8a
--- /dev/null
+++ b/Wan2.1-T2V-1.3B/.gitattributes
@@ -0,0 +1,47 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+google/umt5-xxl/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+assets/comp_effic.png filter=lfs diff=lfs merge=lfs -text
+assets/data_for_diff_stage.jpg filter=lfs diff=lfs merge=lfs -text
+assets/i2v_res.png filter=lfs diff=lfs merge=lfs -text
+assets/logo.png filter=lfs diff=lfs merge=lfs -text
+assets/t2v_res.jpg filter=lfs diff=lfs merge=lfs -text
+assets/vben_vs_sota.png filter=lfs diff=lfs merge=lfs -text
+assets/vben_vs_sota_t2i.jpg filter=lfs diff=lfs merge=lfs -text
+assets/video_dit_arch.jpg filter=lfs diff=lfs merge=lfs -text
+assets/video_vae_res.jpg filter=lfs diff=lfs merge=lfs -text
+examples/i2v_input.JPG filter=lfs diff=lfs merge=lfs -text
+assets/.DS_Store filter=lfs diff=lfs merge=lfs -text
diff --git a/Wan2.1-T2V-1.3B/LICENSE.txt b/Wan2.1-T2V-1.3B/LICENSE.txt
new file mode 100644
index 0000000000000000000000000000000000000000..261eeb9e9f8b2b4b0d119366dda99c6fd7d35c64
--- /dev/null
+++ b/Wan2.1-T2V-1.3B/LICENSE.txt
@@ -0,0 +1,201 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
diff --git a/Wan2.1-T2V-1.3B/README.md b/Wan2.1-T2V-1.3B/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..fbbf7c13900d37f3179c546d09372aeb52e0fae1
--- /dev/null
+++ b/Wan2.1-T2V-1.3B/README.md
@@ -0,0 +1,298 @@
+---
+license: apache-2.0
+language:
+- en
+- zh
+pipeline_tag: text-to-video
+library_name: diffusers
+tags:
+- video
+- video-generation
+---
+# Wan2.1
+
+
+
+
+
+
+ 💜 Wan    |    🖥️ GitHub    |   🤗 Hugging Face   |   🤖 ModelScope   |    📑 Paper (Coming soon)    |    📑 Blog    |   💬 WeChat Group   |    📖 Discord  
+
+
+-----
+
+[**Wan: Open and Advanced Large-Scale Video Generative Models**]("#")
+
+In this repository, we present **Wan2.1**, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation. **Wan2.1** offers these key features:
+- 👍 **SOTA Performance**: **Wan2.1** consistently outperforms existing open-source models and state-of-the-art commercial solutions across multiple benchmarks.
+- 👍 **Supports Consumer-grade GPUs**: The T2V-1.3B model requires only 8.19 GB VRAM, making it compatible with almost all consumer-grade GPUs. It can generate a 5-second 480P video on an RTX 4090 in about 4 minutes (without optimization techniques like quantization). Its performance is even comparable to some closed-source models.
+- 👍 **Multiple Tasks**: **Wan2.1** excels in Text-to-Video, Image-to-Video, Video Editing, Text-to-Image, and Video-to-Audio, advancing the field of video generation.
+- 👍 **Visual Text Generation**: **Wan2.1** is the first video model capable of generating both Chinese and English text, featuring robust text generation that enhances its practical applications.
+- 👍 **Powerful Video VAE**: **Wan-VAE** delivers exceptional efficiency and performance, encoding and decoding 1080P videos of any length while preserving temporal information, making it an ideal foundation for video and image generation.
+
+
+This repository hosts our T2V-1.3B model, a versatile solution for video generation that is compatible with nearly all consumer-grade GPUs. In this way, we hope that **Wan2.1** can serve as an easy-to-use tool for more creative teams in video creation, providing a high-quality foundational model for academic teams with limited computing resources. This will facilitate both the rapid development of the video creation community and the swift advancement of video technology.
+
+
+## Video Demos
+
+
+
+
+
+
+## 🔥 Latest News!!
+
+* Feb 25, 2025: 👋 We've released the inference code and weights of Wan2.1.
+
+
+## 📑 Todo List
+- Wan2.1 Text-to-Video
+ - [x] Multi-GPU Inference code of the 14B and 1.3B models
+ - [x] Checkpoints of the 14B and 1.3B models
+ - [x] Gradio demo
+ - [ ] Diffusers integration
+ - [ ] ComfyUI integration
+- Wan2.1 Image-to-Video
+ - [x] Multi-GPU Inference code of the 14B model
+ - [x] Checkpoints of the 14B model
+ - [x] Gradio demo
+ - [ ] Diffusers integration
+ - [ ] ComfyUI integration
+
+
+## Quickstart
+
+#### Installation
+Clone the repo:
+```
+git clone https://github.com/Wan-Video/Wan2.1.git
+cd Wan2.1
+```
+
+Install dependencies:
+```
+# Ensure torch >= 2.4.0
+pip install -r requirements.txt
+```
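
The `torch >= 2.4.0` requirement can be checked before installing the rest of the dependencies. The following is a minimal sketch using only the Python standard library. The helper name `meets_minimum` is hypothetical and not part of the Wan2.1 codebase, and the comparison only looks at the numeric release, so a pre-release such as `2.4.0a0` would also pass.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def meets_minimum(installed, minimum=(2, 4, 0)):
    """Compare the leading numeric release (e.g. '2.4.0+cu121') to a minimum."""
    parts = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", installed)
    if parts is None:
        return False
    # A missing patch component (e.g. '2.4') is treated as 0.
    release = tuple(int(p or 0) for p in parts.groups())
    return release >= minimum


try:
    torch_version = version("torch")  # reads package metadata, does not import torch
    status = "ok" if meets_minimum(torch_version) else "older than 2.4.0"
    print(f"torch {torch_version}: {status}")
except PackageNotFoundError:
    print("torch is not installed")
```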
+
+
+#### Model Download
+
+| Models | Download Link | Notes |
+| --------------|-------------------------------------------------------------------------------|-------------------------------|
+| T2V-14B | 🤗 [Huggingface](https://huggingface.co/Wan-AI/Wan2.1-T2V-14B) 🤖 [ModelScope](https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-14B) | Supports both 480P and 720P |
+| I2V-14B-720P | 🤗 [Huggingface](https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-720P) 🤖 [ModelScope](https://www.modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-720P) | Supports 720P |
+| I2V-14B-480P | 🤗 [Huggingface](https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-480P) 🤖 [ModelScope](https://www.modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-480P) | Supports 480P |
+| T2V-1.3B | 🤗 [Huggingface](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B) 🤖 [ModelScope](https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B) | Supports 480P |
+
+
+> 💡Note: The 1.3B model is capable of generating videos at 720P resolution. However, due to limited training at this resolution, the results are generally less stable compared to 480P. For optimal performance, we recommend using 480P resolution.
+
+
+Download models using 🤗 huggingface-cli:
+```
+pip install "huggingface_hub[cli]"
+huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B --local-dir ./Wan2.1-T2V-1.3B
+```
+
+Download models using 🤖 modelscope-cli:
+```
+pip install modelscope
+modelscope download Wan-AI/Wan2.1-T2V-1.3B --local_dir ./Wan2.1-T2V-1.3B
+```
+
+#### Run Text-to-Video Generation
+
+This repository supports two Text-to-Video models (1.3B and 14B) and two resolutions (480P and 720P). The parameters and configurations for these models are as follows:
+
+| Task | 480P resolution | 720P resolution | Model |
+|----------|-----------------|-----------------|-----------------|
+| t2v-14B | ✔️ | ✔️ | Wan2.1-T2V-14B |
+| t2v-1.3B | ✔️ | ❌ | Wan2.1-T2V-1.3B |
+
+##### (1) Without Prompt Extension
+
+To facilitate implementation, we will start with a basic version of the inference process that skips the [prompt extension](#2-using-prompt-extention) step.
+
+- Single-GPU inference
+
+```
+python generate.py --task t2v-1.3B --size 832*480 --ckpt_dir ./Wan2.1-T2V-1.3B --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
+```
+
+If you encounter OOM (Out-of-Memory) issues, you can use the `--offload_model True` and `--t5_cpu` options to reduce GPU memory usage. For example, on an RTX 4090 GPU:
+
+```
+python generate.py --task t2v-1.3B --size 832*480 --ckpt_dir ./Wan2.1-T2V-1.3B --offload_model True --t5_cpu --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
+```
+
+> 💡Note: If you are using the `T2V-1.3B` model, we recommend setting the parameter `--sample_guide_scale 6`. The `--sample_shift` parameter can be adjusted within the range of 8 to 12 depending on the results you observe.
+
+- Multi-GPU inference using FSDP + xDiT USP
+
+```
+pip install "xfuser>=0.4.1"
+torchrun --nproc_per_node=8 generate.py --task t2v-1.3B --size 832*480 --ckpt_dir ./Wan2.1-T2V-1.3B --dit_fsdp --t5_fsdp --ulysses_size 8 --sample_shift 8 --sample_guide_scale 6 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
+```
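
When the single-GPU command above is launched from a script, the flags can be assembled programmatically. The sketch below uses only the flags shown in this section. The helper `build_t2v_command` and its `low_memory` argument are hypothetical names introduced for illustration, and the range check simply mirrors the 8 to 12 recommendation in the note above rather than any validation done by `generate.py` itself.

```python
import shlex


def build_t2v_command(prompt, ckpt_dir="./Wan2.1-T2V-1.3B", sample_shift=8,
                      sample_guide_scale=6, low_memory=False):
    """Assemble the single-GPU generate.py call for the T2V-1.3B model."""
    # Mirror the recommended range from the note above (an assumption,
    # not a check performed by generate.py).
    if not 8 <= sample_shift <= 12:
        raise ValueError("sample_shift is outside the suggested 8 to 12 range")
    cmd = [
        "python", "generate.py",
        "--task", "t2v-1.3B",
        "--size", "832*480",
        "--ckpt_dir", ckpt_dir,
        "--sample_shift", str(sample_shift),
        "--sample_guide_scale", str(sample_guide_scale),
    ]
    if low_memory:
        # Options suggested above for OOM situations, e.g. on an RTX 4090.
        cmd += ["--offload_model", "True", "--t5_cpu"]
    cmd += ["--prompt", prompt]
    return cmd


cmd = build_t2v_command("Two anthropomorphic cats boxing on a stage.", low_memory=True)
print(shlex.join(cmd))  # shell-quoted form, ready to copy into a terminal
```

Passing the returned list to `subprocess.run(cmd, check=True)` avoids shell quoting issues with the `*` in `--size` and with spaces in the prompt.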
+
+
+##### (2) Using Prompt Extention
+
+Extending the prompts can effectively enrich the details in the generated videos, further enhancing the video quality. Therefore, we recommend enabling prompt extension. We provide the following two methods for prompt extension:
+
+- Use the Dashscope API for extension.
+ - Apply for a `dashscope.api_key` in advance ([EN](https://www.alibabacloud.com/help/en/model-studio/getting-started/first-api-call-to-qwen) | [CN](https://help.aliyun.com/zh/model-studio/getting-started/first-api-call-to-qwen)).
+ - Configure the environment variable `DASH_API_KEY` to specify the Dashscope API key. For users of Alibaba Cloud's international site, you also need to set the environment variable `DASH_API_URL` to 'https://dashscope-intl.aliyuncs.com/api/v1'. For more detailed instructions, please refer to the [dashscope document](https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api?spm=a2c63.p38356.0.i1).
+ - Use the `qwen-plus` model for text-to-video tasks and `qwen-vl-max` for image-to-video tasks.
+ - You can modify the model used for extension with the parameter `--prompt_extend_model`. For example:
+```
+DASH_API_KEY=your_key python generate.py --task t2v-1.3B --size 832*480 --ckpt_dir ./Wan2.1-T2V-1.3B --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage" --use_prompt_extend --prompt_extend_method 'dashscope' --prompt_extend_target_lang 'ch'
+```
+
+- Using a local model for extension.
+
+ - By default, the Qwen model on HuggingFace is used for this extension. Users can choose based on the available GPU memory size.
+ - For text-to-video tasks, you can use models like `Qwen/Qwen2.5-14B-Instruct`, `Qwen/Qwen2.5-7B-Instruct` and `Qwen/Qwen2.5-3B-Instruct`.
+ - For image-to-video tasks, you can use models like `Qwen/Qwen2.5-VL-7B-Instruct` and `Qwen/Qwen2.5-VL-3B-Instruct`.
+ - Larger models generally provide better extension results but require more GPU memory.
+ - You can modify the model used for extension with the parameter `--prompt_extend_model`, which accepts either a local model path or a Hugging Face model. For example:
+
+```
+python generate.py --task t2v-1.3B --size 832*480 --ckpt_dir ./Wan2.1-T2V-1.3B --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage" --use_prompt_extend --prompt_extend_method 'local_qwen' --prompt_extend_target_lang 'ch'
+```
+
+##### (3) Running local gradio
+
+```
+cd gradio
+# if one uses dashscope’s API for prompt extension
+DASH_API_KEY=your_key python t2v_1.3B_singleGPU.py --prompt_extend_method 'dashscope' --ckpt_dir ./Wan2.1-T2V-1.3B
+
+# if one uses a local model for prompt extension
+python t2v_1.3B_singleGPU.py --prompt_extend_method 'local_qwen' --ckpt_dir ./Wan2.1-T2V-1.3B
+```
+
+
+
+## Evaluation
+
+We employ our **Wan-Bench** framework to evaluate the performance of the T2V-1.3B model, with the results displayed in the table below. The results indicate that our smaller 1.3B model surpasses the overall metrics of larger open-source models, demonstrating the effectiveness of **Wan2.1**'s architecture and the data construction pipeline.
+
+
+
+
+
+
+
+## Computational Efficiency on Different GPUs
+
+We test the computational efficiency of different **Wan2.1** models on different GPUs in the following table. The results are presented in the format: **Total time (s) / peak GPU memory (GB)**.
+
+
+
+
+
+
+> The parameter settings for the tests presented in this table are as follows:
+> (1) For the 1.3B model on 8 GPUs, set `--ring_size 8` and `--ulysses_size 1`;
+> (2) For the 14B model on 1 GPU, use `--offload_model True`;
+> (3) For the 1.3B model on a single 4090 GPU, set `--offload_model True --t5_cpu`;
+> (4) For all tests, no prompt extension was applied, meaning `--use_prompt_extend` was not enabled.
+
+-------
+
+## Introduction of Wan2.1
+
+**Wan2.1** is designed on the mainstream diffusion transformer paradigm, achieving significant advancements in generative capabilities through a series of innovations. These include our novel spatio-temporal variational autoencoder (VAE), scalable training strategies, large-scale data construction, and automated evaluation metrics. Collectively, these contributions enhance the model’s performance and versatility.
+
+
+##### (1) 3D Variational Autoencoders
+We propose a novel 3D causal VAE architecture, termed **Wan-VAE**, specifically designed for video generation. By combining multiple strategies, we improve spatio-temporal compression, reduce memory usage, and ensure temporal causality. **Wan-VAE** demonstrates significant advantages in performance efficiency compared to other open-source VAEs. Furthermore, our **Wan-VAE** can encode and decode unlimited-length 1080P videos without losing historical temporal information, making it particularly well-suited for video generation tasks.
+
+
+
+
+
+
+
+##### (2) Video Diffusion DiT
+
+**Wan2.1** is designed using the Flow Matching framework within the paradigm of mainstream Diffusion Transformers. Our model's architecture uses the T5 Encoder to encode multilingual text input, with cross-attention in each transformer block embedding the text into the model structure. Additionally, we employ an MLP with a Linear layer and a SiLU layer to process the input time embeddings and predict six modulation parameters individually. This MLP is shared across all transformer blocks, with each block learning a distinct set of biases. Our experimental findings reveal a significant performance improvement with this approach at the same parameter scale.
+
+
+
+
+
+
+| Model | Dimension | Input Dimension | Output Dimension | Feedforward Dimension | Frequency Dimension | Number of Heads | Number of Layers |
+|--------|-----------|-----------------|------------------|-----------------------|---------------------|-----------------|------------------|
+| 1.3B | 1536 | 16 | 16 | 8960 | 256 | 12 | 30 |
+| 14B | 5120 | 16 | 16 | 13824 | 256 | 40 | 40 |
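
As a rough sanity check, the dimensions in this table can be related to the model names. The sketch below counts only the large weight matrices in each transformer block, assuming four `dim x dim` projections (query, key, value, output) for both self-attention and cross-attention and a two-matrix feed-forward layer. These structural details are assumptions made for the estimate, and biases, norms, embeddings, and modulation parameters are ignored, so the totals are approximate.

```python
def estimate_block_params(dim, ffn_dim, num_layers):
    """Approximate weight count of the transformer blocks (assumed layout)."""
    attn = 4 * dim * dim           # q, k, v, o projections (assumption)
    ffn = 2 * dim * ffn_dim        # up and down projections (assumption)
    per_layer = 2 * attn + ffn     # self-attention + cross-attention + FFN
    return per_layer * num_layers


# Values copied from the table above.
configs = {
    "1.3B": dict(dim=1536, ffn_dim=8960, num_heads=12, num_layers=30),
    "14B": dict(dim=5120, ffn_dim=13824, num_heads=40, num_layers=40),
}

for name, c in configs.items():
    total = estimate_block_params(c["dim"], c["ffn_dim"], c["num_layers"])
    head_dim = c["dim"] // c["num_heads"]
    print(f"{name}: ~{total / 1e9:.2f}B block weights, head_dim={head_dim}")
```

Under these assumptions the estimate comes to about 1.39B and 14.05B weights respectively, and both configurations use a per-head dimension of 128.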
+
+
+
+##### Data
+
+We curated and deduplicated a candidate dataset comprising a vast amount of image and video data. During the data curation process, we designed a four-step data cleaning process, focusing on fundamental dimensions, visual quality and motion quality. Through the robust data processing pipeline, we can easily obtain high-quality, diverse, and large-scale training sets of images and videos.
+
+
+
+
+##### Comparisons to SOTA
+We compared **Wan2.1** with leading open-source and closed-source models to evaluate its performance. Using our carefully designed set of 1,035 internal prompts, we tested across 14 major dimensions and 26 sub-dimensions. We then calculated the total score as a weighted average based on the importance of each dimension. The detailed results are shown in the table below. These results demonstrate our model's superior performance compared to both open-source and closed-source models.
+
+
+
+
+## Citation
+If you find our work helpful, please cite us.
+
+```
+@article{wan2.1,
+ title = {Wan: Open and Advanced Large-Scale Video Generative Models},
+ author = {Wan Team},
+ journal = {},
+ year = {2025}
+}
+```
+
+## License Agreement
+The models in this repository are licensed under the Apache 2.0 License. We claim no rights over your generated content, granting you the freedom to use it while ensuring that your usage complies with the provisions of this license. You are fully accountable for your use of the models, which must not involve sharing any content that violates applicable laws, causes harm to individuals or groups, disseminates personal information intended for harm, spreads misinformation, or targets vulnerable populations. For a complete list of restrictions and details regarding your rights, please refer to the full text of the [license](LICENSE.txt).
+
+
+## Acknowledgements
+
+We would like to thank the contributors to the [SD3](https://huggingface.co/stabilityai/stable-diffusion-3-medium), [Qwen](https://huggingface.co/Qwen), [umt5-xxl](https://huggingface.co/google/umt5-xxl), [diffusers](https://github.com/huggingface/diffusers) and [HuggingFace](https://huggingface.co) repositories, for their open research.
+
+
+
+## Contact Us
+If you would like to leave a message to our research or product teams, feel free to join our [Discord](https://discord.gg/p5XbdQV7) or [WeChat groups](https://gw.alicdn.com/imgextra/i2/O1CN01tqjWFi1ByuyehkTSB_!!6000000000015-0-tps-611-1279.jpg)!
\ No newline at end of file
diff --git a/Wan2.1-T2V-1.3B/config.json b/Wan2.1-T2V-1.3B/config.json
new file mode 100644
index 0000000000000000000000000000000000000000..d203bef600b2f3c64fe1f5f53d70a2087f4ccd2f
--- /dev/null
+++ b/Wan2.1-T2V-1.3B/config.json
@@ -0,0 +1,14 @@
+{
+ "_class_name": "WanModel",
+ "_diffusers_version": "0.30.0",
+ "dim": 1536,
+ "eps": 1e-06,
+ "ffn_dim": 8960,
+ "freq_dim": 256,
+ "in_dim": 16,
+ "model_type": "t2v",
+ "num_heads": 12,
+ "num_layers": 30,
+ "out_dim": 16,
+ "text_len": 512
+}
diff --git a/Wan2.1-T2V-1.3B/examples/i2v_input.JPG b/Wan2.1-T2V-1.3B/examples/i2v_input.JPG
new file mode 100644
index 0000000000000000000000000000000000000000..8c7fabd943752179587eb717362db32ce1eb4800
--- /dev/null
+++ b/Wan2.1-T2V-1.3B/examples/i2v_input.JPG
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:077e3d965090c9028c69c00931675f42e1acc815c6eb450ab291b3b72d211a8e
+size 250628
diff --git a/Wan2.1-T2V-1.3B/google/umt5-xxl/special_tokens_map.json b/Wan2.1-T2V-1.3B/google/umt5-xxl/special_tokens_map.json
new file mode 100644
index 0000000000000000000000000000000000000000..14855e7052ffbb595057dfd791d293c1c940db2c
--- /dev/null
+++ b/Wan2.1-T2V-1.3B/google/umt5-xxl/special_tokens_map.json
@@ -0,0 +1,308 @@
+{
+ "additional_special_tokens": [
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "",
+ "