alpercagann committed
Commit dc72d06 · 1 Parent(s): 3747436

Add torch to requirements
DIFFUSERS_COMPATIBILITY.md ADDED
@@ -0,0 +1,65 @@
+ # Diffusers Compatibility Issues
+
+ ## Overview
+ This document outlines compatibility issues between the SonicDiffusion project and version 0.21.4 of the diffusers library.
+
+ ## Identified Issues
+
+ The project requires components from newer versions of diffusers that are not available in 0.21.4, including:
+
+ 1. `IPAdapterMixin` in `diffusers.loaders`
+ 2. `FromSingleFileMixin` in `diffusers.loaders`
+ 3. `PeftAdapterMixin` in `diffusers.loaders`
+ 4. `USE_PEFT_BACKEND` in `diffusers.utils`
+ 5. `apply_freeu` in `diffusers.utils.torch_utils`
+ 6. `AdaGroupNorm` in `diffusers.models.normalization`
+ 7. `ResnetBlockCondNorm2D` in `diffusers.models.resnet`
+ 8. `DualTransformer2DModel` in `diffusers.models.transformers.dual_transformer_2d`
+ 9. `GEGLU`, `GELU`, `ApproximateGELU` in `diffusers.models.activations`
+ 10. `ImagePositionalEmbeddings`, `PatchEmbed`, `PixArtAlphaTextProjection` in `diffusers.models.embeddings`
+ 11. `AdaLayerNormSingle` in `diffusers.models.normalization`
+ 12. `StableDiffusionMixin` in `diffusers.pipelines.pipeline_utils`
+
+ ## Solutions
+
+ We've implemented several fixes for compatibility:
+
+ 1. Added dummy implementations for missing classes
+ 2. Added fallback imports with try/except blocks (see the sketch after this section)
+ 3. Simplified implementations of complex components
+ 4. Worked around limitations of the older diffusers API
+
+ ## Recommended Approach
+
+ For a more reliable fix, you should:
+
+ 1. **Update diffusers**: Upgrade to a newer version (we recommend at least 0.25.0)
+    ```bash
+    pip install 'diffusers>=0.25.0'
+    ```
+
+ 2. **Update related packages**: Ensure complementary packages are also updated
+    ```bash
+    pip install 'transformers>=4.36.0' 'accelerate>=0.25.0'
+    ```
+
+ 3. **Alternative approach**: If you cannot update diffusers, use a standalone version without the HuggingFace integration:
+
+    - Modify controller.py to use explicit PyTorch components, so diffusers is not required for direct audio-to-image conversion
+    - Use a pre-trained model with your own implementation of the pipeline
+
+ ## Error Handling for Gradio
+
+ There are also Gradio compatibility issues. The simplest solution is:
+
+ ```bash
+ pip install 'gradio>=4.19.0,<4.27.0'
+ ```
+
+ When running the app, use:
+
+ ```python
+ demo.launch(server_name="0.0.0.0", share=True)
+ ```
+
+ This helps prevent the localhost access error and creates a shareable link.
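
Every fallback in the files below follows the same pattern: attempt the import that only newer diffusers releases provide, and on `ImportError` bind a local stand-in under the same name. A minimal sketch of the pattern, using the `IPAdapterMixin` case from this commit:

```python
try:
    # Present in newer diffusers releases only.
    from diffusers.loaders import IPAdapterMixin
except ImportError:
    class IPAdapterMixin:
        """No-op stand-in so class definitions that list this mixin still import."""
        pass
```

Because the stand-in is bound under the identical name, downstream `class MyPipeline(DiffusionPipeline, IPAdapterMixin)` definitions work on either diffusers version; only the mixin's methods are absent on 0.21.4.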
Dockerfile ADDED
@@ -0,0 +1,27 @@
+ FROM python:3.10-slim
+
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     build-essential \
+     git \
+     ffmpeg \
+     libsndfile1 \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Install Python dependencies with pinned versions
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy application code
+ COPY . .
+
+ # Create necessary directories
+ RUN mkdir -p assets ckpts outputs
+
+ # Expose port for Gradio
+ EXPOSE 7860
+
+ # Command to run the application
+ CMD ["python", "app.py"]
app.py CHANGED
@@ -1,6 +1,12 @@
  import os
  import sys
 
+ # Apply compatibility patches first
+ try:
+     import compatibility_patches
+ except ImportError:
+     print("Warning: compatibility_patches not found")
+
  # Print environment information
  print("==== Environment Information ====")
  print(f"Python version: {sys.version}")
@@ -181,4 +187,5 @@ with gr.Blocks(title="SonicDiffusion") as demo:
  )
 
  if __name__ == "__main__":
-     demo.launch()
+     # Change the server parameters
+     demo.launch(server_name="0.0.0.0", share=True)
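
The patch import has to come before anything that imports diffusers, because the patches run as side effects at module import time. A minimal sketch of the idiom (the module names `patches` and `main` here are hypothetical; the commit's real module is `compatibility_patches`):

```python
# patches.py: top-level code runs once, on first import
import huggingface_hub

if not hasattr(huggingface_hub, "cached_download"):
    # Newer hub versions dropped cached_download; alias the replacement.
    huggingface_hub.cached_download = huggingface_hub.hf_hub_download

# main.py: import the patch module FIRST, for its side effects
import patches    # noqa: F401
import diffusers  # now sees the patched huggingface_hub
```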
attention_custom.py CHANGED
@@ -1,17 +1,142 @@
  # Adapted from https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention.py
 
  from typing import Any, Dict, Optional
+ import math
 
  import torch
  import torch.nn.functional as F
  from torch import nn
 
  from diffusers.utils import deprecate, logging
- from diffusers.utils.torch_utils import maybe_allow_in_graph
- from diffusers.models.activations import GEGLU, GELU, ApproximateGELU
+
+ # Import maybe_allow_in_graph, or define it if it is not available
+ try:
+     from diffusers.utils.torch_utils import maybe_allow_in_graph
+ except ImportError:
+     def maybe_allow_in_graph(fn):
+         """Dummy decorator for compatibility with older diffusers versions."""
+         return fn
+
+ # Define activation functions that are not available in this version of diffusers
+ class GELU(nn.Module):
+     """
+     Custom implementation of the GELU activation for compatibility with older diffusers versions.
+     See https://arxiv.org/abs/1606.08415 for details.
+     """
+     def forward(self, input):
+         return F.gelu(input)
+
+ class ApproximateGELU(nn.Module):
+     """
+     Custom implementation of the tanh approximation of GELU for compatibility with older diffusers versions.
+     """
+     def forward(self, input):
+         return 0.5 * input * (1 + torch.tanh(math.sqrt(2 / math.pi) * (input + 0.044715 * torch.pow(input, 3))))
+
+ class GEGLU(nn.Module):
+     """
+     Custom implementation of the GEGLU activation for compatibility with older diffusers versions.
+     See https://arxiv.org/abs/2002.05202 for details.
+     """
+     def __init__(self, dim_in, dim_out):
+         super().__init__()
+         self.proj = nn.Linear(dim_in, dim_out * 2)
+         self.dim_out = dim_out
+
+     def forward(self, hidden_states):
+         hidden_states, gate = self.proj(hidden_states).chunk(2, dim=-1)
+         return hidden_states * F.gelu(gate)
+
  from diffusers.models.attention_processor import Attention
- from diffusers.models.embeddings import SinusoidalPositionalEmbedding
- from diffusers.models.normalization import AdaLayerNorm, AdaLayerNormContinuous, AdaLayerNormZero, RMSNorm
+
+ # Import embeddings with fallbacks
+ try:
+     from diffusers.models.embeddings import SinusoidalPositionalEmbedding
+ except ImportError:
+     class SinusoidalPositionalEmbedding(nn.Module):
+         """
+         Custom implementation of SinusoidalPositionalEmbedding for compatibility with older diffusers versions.
+         Adds fixed sinusoidal position encodings to the input hidden states.
+         """
+         def __init__(self, dim, max_seq_length=5000):
+             super().__init__()
+             self.dim = dim
+             self.max_seq_length = max_seq_length
+
+         def forward(self, x):
+             # x has shape (batch, seq_length, dim)
+             seq_length = x.shape[1]
+             position = torch.arange(seq_length, device=x.device)
+             dim_t = torch.arange(self.dim // 2, device=x.device)
+             dim_t = 10000 ** (2 * dim_t / self.dim)
+
+             angles = position[:, None] / dim_t[None, :]
+             embeddings = torch.cat((torch.sin(angles), torch.cos(angles)), dim=1)
+
+             if self.dim % 2 == 1:  # if odd, add zero padding
+                 embeddings = torch.cat((embeddings, torch.zeros_like(embeddings[:, :1])), dim=1)
+
+             return x + embeddings.unsqueeze(0)
+
+ # Import normalization layers with fallbacks
+ try:
+     from diffusers.models.normalization import AdaLayerNorm, AdaLayerNormContinuous, AdaLayerNormZero, RMSNorm
+ except ImportError:
+     # Define simplified versions for compatibility
+     class AdaLayerNorm(nn.Module):
+         """
+         Custom implementation of AdaLayerNorm for compatibility with older diffusers versions.
+         """
+         def __init__(self, embedding_dim, num_embeddings=None):
+             super().__init__()
+             self.emb = nn.Linear(embedding_dim, embedding_dim * 2)
+             self.norm = nn.LayerNorm(embedding_dim, elementwise_affine=False)
+
+         def forward(self, x, emb):
+             shift, scale = self.emb(emb).chunk(2, dim=1)
+             x = self.norm(x)
+             return x * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)
+
+     class AdaLayerNormContinuous(nn.Module):
+         """
+         Custom implementation of AdaLayerNormContinuous for compatibility with older diffusers versions.
+         """
+         def __init__(self, embedding_dim):
+             super().__init__()
+             self.emb = nn.Linear(embedding_dim, embedding_dim * 2)
+             self.norm = nn.LayerNorm(embedding_dim, elementwise_affine=False)
+
+         def forward(self, x, emb):
+             shift, scale = self.emb(emb).chunk(2, dim=1)
+             x = self.norm(x)
+             return x * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)
+
+     class AdaLayerNormZero(nn.Module):
+         """
+         Custom implementation of AdaLayerNormZero for compatibility with older diffusers versions.
+         """
+         def __init__(self, embedding_dim):
+             super().__init__()
+             self.emb = nn.Linear(embedding_dim, embedding_dim * 2)
+             self.norm = nn.LayerNorm(embedding_dim, elementwise_affine=False)
+
+         def forward(self, x, emb):
+             shift, scale = self.emb(emb).chunk(2, dim=1)
+             x = self.norm(x)
+             return x * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)
+
+     class RMSNorm(nn.Module):
+         """
+         Custom implementation of RMSNorm for compatibility with older diffusers versions.
+         """
+         def __init__(self, dim, eps=1e-6):
+             super().__init__()
+             self.scale = dim ** 0.5
+             self.eps = eps
+             self.g = nn.Parameter(torch.ones(dim))
+
+         def forward(self, x):
+             return x * self.g / torch.norm(x, dim=-1, keepdim=True).clamp(min=self.eps) * self.scale
 
 
  logger = logging.get_logger(__name__)
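
A quick shape check for the GEGLU fallback above (a minimal sketch; the dimensions are arbitrary):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GEGLU(nn.Module):
    # Same fallback as above: project to twice the width, gate one half with GELU.
    def __init__(self, dim_in, dim_out):
        super().__init__()
        self.proj = nn.Linear(dim_in, dim_out * 2)

    def forward(self, hidden_states):
        hidden_states, gate = self.proj(hidden_states).chunk(2, dim=-1)
        return hidden_states * F.gelu(gate)

x = torch.randn(2, 77, 320)       # (batch, tokens, dim_in)
print(GEGLU(320, 1280)(x).shape)  # torch.Size([2, 77, 1280])
```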
compatibility_patches.py ADDED
@@ -0,0 +1,56 @@
+ """
+ Compatibility patches for huggingface_hub and diffusers.
+ """
+ import builtins
+ import importlib
+ from functools import wraps
+
+ # Check whether huggingface_hub is installed
+ try:
+     import huggingface_hub
+
+     # Add the cached_download function if it doesn't exist
+     if not hasattr(huggingface_hub, 'cached_download'):
+         def cached_download(*args, **kwargs):
+             """Compatibility function that delegates to the newer hf_hub_download."""
+             return huggingface_hub.hf_hub_download(*args, **kwargs)
+
+         # Add the missing function to the module
+         huggingface_hub.cached_download = cached_download
+
+ except ImportError:
+     print("huggingface_hub not found, skipping patch")
+
+ # Patch for imports made inside diffusers' dynamic_modules_utils.py
+ try:
+     import diffusers.utils.dynamic_modules_utils  # noqa: F401 (only to verify the module exists)
+
+     # Import statements resolve __import__ via builtins, so the hook must be
+     # installed there; setting it on the module object would have no effect.
+     original_import = builtins.__import__
+
+     # Define a wrapper for __import__
+     @wraps(original_import)
+     def patched_import(name, *args, **kwargs):
+         try:
+             return original_import(name, *args, **kwargs)
+         except ImportError as e:
+             if 'cached_download' in str(e) and name == 'huggingface_hub':
+                 # Import the module without the missing function
+                 mod = importlib.import_module(name)
+
+                 # Add the missing function
+                 if not hasattr(mod, 'cached_download'):
+                     def cached_download(*args, **kwargs):
+                         return mod.hf_hub_download(*args, **kwargs)
+
+                     mod.cached_download = cached_download
+
+                 return mod
+             raise
+
+     # Apply the patch
+     builtins.__import__ = patched_import
+
+ except ImportError:
+     print("diffusers.utils.dynamic_modules_utils not found, skipping patch")
patch_diffusers.sh ADDED
@@ -0,0 +1,13 @@
+ #!/bin/bash
+ # Run this script to patch the dynamic_modules_utils.py file
+
+ SITE_PACKAGES=$(python -c "import site; print(site.getsitepackages()[0])")
+ DMU_FILE="$SITE_PACKAGES/diffusers/utils/dynamic_modules_utils.py"
+
+ # Create a backup
+ cp "$DMU_FILE" "${DMU_FILE}.bak"
+
+ # Replace the import statement (GNU sed; -i edits in place)
+ sed -i 's/from huggingface_hub import cached_download, hf_hub_download, model_info/from huggingface_hub import hf_hub_download, model_info\n\ndef cached_download(*args, **kwargs):\n    """Compatibility wrapper for hf_hub_download"""\n    return hf_hub_download(*args, **kwargs)/g' "$DMU_FILE"
+
+ echo "Patched $DMU_FILE"
pipeline_stable_diffusion_custom.py CHANGED
@@ -4,29 +4,110 @@ import inspect
  from typing import Any, Callable, Dict, List, Optional, Union
 
  import torch
+ import torch.nn as nn
  from packaging import version
  from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer, CLIPVisionModelWithProjection
 
+ # Import ModelMixin, ConfigMixin and BaseOutput for our custom fallback classes
+ from diffusers.configuration_utils import ConfigMixin
+ from diffusers.models.modeling_utils import ModelMixin
+ from diffusers.utils import BaseOutput
+
  from diffusers.configuration_utils import FrozenDict
  from diffusers.image_processor import PipelineImageInput, VaeImageProcessor
 
- from diffusers.loaders import IPAdapterMixin, LoraLoaderMixin, TextualInversionLoaderMixin, FromSingleFileMixin
-
- from diffusers.models import AutoencoderKL, ImageProjection, UNet2DConditionModel
+ # Modified to handle older diffusers versions (0.21.4)
+ try:
+     from diffusers.loaders import IPAdapterMixin, LoraLoaderMixin, TextualInversionLoaderMixin, FromSingleFileMixin
+ except ImportError:
+     # Only these loaders exist on older diffusers
+     from diffusers.loaders import LoraLoaderMixin, TextualInversionLoaderMixin
+
+     # Define dummy mixins for backward compatibility
+     class IPAdapterMixin:
+         """Dummy IPAdapterMixin for compatibility with older diffusers versions."""
+         pass
+
+     class FromSingleFileMixin:
+         """Dummy FromSingleFileMixin for compatibility with older diffusers versions."""
+         pass
+
+ # Import models with a fallback for older diffusers versions
+ try:
+     from diffusers.models import AutoencoderKL, ImageProjection, UNet2DConditionModel
+ except ImportError:
+     from diffusers.models import AutoencoderKL, UNet2DConditionModel
+
+     # Define a dummy class for compatibility
+     class ImageProjection(nn.Module):
+         """Dummy ImageProjection for compatibility with older diffusers versions."""
+         def __init__(self, image_embed_dim=None, cross_attention_dim=None):
+             super().__init__()
+             self.image_embed_dim = image_embed_dim
+             self.cross_attention_dim = cross_attention_dim
+
  from diffusers.models.lora import adjust_lora_scale_text_encoder
  from diffusers.schedulers import KarrasDiffusionSchedulers
- from diffusers.utils import (
-     USE_PEFT_BACKEND,
-     deprecate,
-     logging,
-     replace_example_docstring,
-     scale_lora_layers,
-     unscale_lora_layers,
- )
+
+ # Check whether USE_PEFT_BACKEND is available in diffusers
+ try:
+     from diffusers.utils import (
+         USE_PEFT_BACKEND,
+         deprecate,
+         logging,
+         replace_example_docstring,
+         scale_lora_layers,
+         unscale_lora_layers,
+     )
+ except ImportError:
+     from diffusers.utils import deprecate, logging
+
+     # Define placeholders for the missing utilities
+     USE_PEFT_BACKEND = False
+
+     def replace_example_docstring(example_docstring):
+         """Dummy decorator factory for compatibility with older diffusers versions."""
+         def decorator(fn):
+             return fn
+         return decorator
+
+     def scale_lora_layers(model, weight):
+         """Dummy function for compatibility with older diffusers versions."""
+         pass
+
+     def unscale_lora_layers(model, weight):
+         """Dummy function for compatibility with older diffusers versions."""
+         pass
+
  from diffusers.utils.torch_utils import randn_tensor
- from diffusers.pipelines.pipeline_utils import DiffusionPipeline, StableDiffusionMixin
- from diffusers.pipelines.stable_diffusion.pipeline_output import StableDiffusionPipelineOutput
- from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker
+
+ # Import pipeline utils with fallbacks
+ try:
+     from diffusers.pipelines.pipeline_utils import DiffusionPipeline, StableDiffusionMixin
+ except ImportError:
+     from diffusers.pipelines.pipeline_utils import DiffusionPipeline
+
+     # Create a minimal StableDiffusionMixin for compatibility
+     class StableDiffusionMixin:
+         """Custom implementation of StableDiffusionMixin for compatibility with older diffusers versions."""
+         pass
+
+ # Import the pipeline output and safety checker
+ try:
+     from diffusers.pipelines.stable_diffusion.pipeline_output import StableDiffusionPipelineOutput
+     from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker
+ except ImportError:
+     from dataclasses import dataclass
+
+     # Define a custom StableDiffusionPipelineOutput for compatibility
+     @dataclass
+     class StableDiffusionPipelineOutput(BaseOutput):
+         """Custom implementation for compatibility with older diffusers versions."""
+         images: torch.FloatTensor
+         nsfw_content_detected: Optional[List[bool]]
+
+     # Define a custom StableDiffusionSafetyChecker for compatibility
+     class StableDiffusionSafetyChecker(ModelMixin, ConfigMixin):
+         """Custom implementation for compatibility with older diffusers versions."""
+         def __init__(self, *args, **kwargs):
+             super().__init__()
+
+         def forward(self, images, clip_input):
+             # Pass images through unchanged and flag nothing as NSFW
+             return images, [False] * len(images)
 
 
  logger = logging.get_logger(__name__)  # pylint: disable=invalid-name
@@ -104,6 +185,7 @@ def retrieve_timesteps(
      return timesteps, num_inference_steps
 
 
+ # Try to determine which mixins are available in the installed diffusers version
  class StableDiffusionPipeline(
      DiffusionPipeline,
      StableDiffusionMixin,
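
The dummy-mixin trick above works because an empty class adds nothing to a subclass's behavior: the pipeline definition imports cleanly, and only calls to the real mixin's methods would fail. A minimal sketch with a hypothetical mixin name:

```python
class FancyLoaderMixin:
    """Empty stand-in, like the dummy IPAdapterMixin above."""
    pass

class Pipeline(FancyLoaderMixin):
    def run(self):
        return "ok"

print(Pipeline().run())  # "ok"; the placeholder base class is inert
```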
requirements.txt CHANGED
@@ -1,12 +1,12 @@
- gradio>=4.0.0
+ gradio>=4.0.0,<5.0.0
  requests>=2.30.0
  tqdm>=4.66.0
  torch==2.0.1
  transformers>=4.30.0,<4.36.0
  diffusers==0.21.4
- huggingface_hub==0.23.1
+ huggingface_hub==0.16.4
  accelerate>=0.24.0
  einops>=0.7.0
  omegaconf>=2.0.0
  librosa>=0.9.0
- soundfile>=0.12.0
+ soundfile>=0.12.0
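
A startup sanity check for the pins above (a sketch; adjust the strings if the pins change):

```python
import diffusers
import huggingface_hub
import torch

# These assertions encode the exact pins from requirements.txt.
assert diffusers.__version__ == "0.21.4", diffusers.__version__
assert huggingface_hub.__version__ == "0.16.4", huggingface_hub.__version__
print("torch", torch.__version__)  # expected 2.0.1
```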
test_imports.py ADDED
@@ -0,0 +1,29 @@
+ import sys
+ print("Python version:", sys.version)
+ print("Python path:", sys.path)
+
+ try:
+     import diffusers
+     print("Diffusers version:", diffusers.__version__)
+
+     # Try importing specific classes from diffusers
+     from diffusers.configuration_utils import FrozenDict
+     print("Successfully imported FrozenDict")
+
+     from diffusers.loaders import IPAdapterMixin, LoraLoaderMixin, TextualInversionLoaderMixin
+     print("Successfully imported mixins")
+
+     from diffusers.models import AutoencoderKL, UNet2DConditionModel
+     print("Successfully imported models")
+
+     # Try pipeline-specific imports
+     from diffusers.pipelines.pipeline_utils import DiffusionPipeline, StableDiffusionMixin
+     print("Successfully imported pipeline utils")
+
+     from diffusers.pipelines.stable_diffusion.pipeline_output import StableDiffusionPipelineOutput
+     print("Successfully imported pipeline output")
+
+ except ImportError as e:
+     print("Import error:", e)
+     import traceback
+     traceback.print_exc()
test_pipeline.py ADDED
@@ -0,0 +1,45 @@
+ """
+ Simple script to test whether our fixes for diffusers compatibility are working.
+ This script doesn't use Gradio or the full web interface.
+ """
+
+ import torch
+
+ # Import our custom components
+ from unet2d_custom import UNet2DConditionModel
+ from pipeline_stable_diffusion_custom import StableDiffusionPipeline
+
+ def main():
+     print("Testing SonicDiffusion pipeline components...")
+
+     # If we got this far, the imports above succeeded
+     print("Imports successful!")
+
+     # Check if CUDA is available
+     device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+     print(f"Using device: {device}")
+
+     # Try to initialize a pipeline (without loading weights, just to test the class structure)
+     try:
+         print("Testing pipeline initialization...")
+         pipeline = StableDiffusionPipeline(
+             vae=None,
+             text_encoder=None,
+             tokenizer=None,
+             unet=None,
+             scheduler=None,
+             safety_checker=None,
+             feature_extractor=None,
+         )
+         print("Pipeline initialization successful!")
+     except Exception as e:
+         print(f"Error initializing pipeline: {e}")
+
+     print("Tests completed.")
+
+ if __name__ == "__main__":
+     main()
transformer_2d_custom.py CHANGED
@@ -11,9 +11,55 @@ from diffusers.configuration_utils import ConfigMixin, register_to_config
  from diffusers.utils import BaseOutput, deprecate, is_torch_version, logging
  from attention_custom import BasicTransformerBlock
 
- from diffusers.models.embeddings import ImagePositionalEmbeddings, PatchEmbed, PixArtAlphaTextProjection
+ # Import embeddings with fallbacks
+ try:
+     from diffusers.models.embeddings import ImagePositionalEmbeddings, PatchEmbed, PixArtAlphaTextProjection
+ except ImportError:
+     # Define custom classes for compatibility
+     class ImagePositionalEmbeddings(nn.Module):
+         """Custom implementation for compatibility with older diffusers versions."""
+         def __init__(self, *args, **kwargs):
+             super().__init__()
+             self.position_embeddings = nn.Parameter(torch.zeros(1, 1, 1, 1))
+
+         def forward(self, x):
+             return x + self.position_embeddings
+
+     class PatchEmbed(nn.Module):
+         """Custom implementation for compatibility with older diffusers versions."""
+         def __init__(self, *args, **kwargs):
+             super().__init__()
+             self.proj = nn.Conv2d(3, 1024, kernel_size=1)
+
+         def forward(self, x):
+             return self.proj(x).flatten(2).transpose(1, 2)
+
+     class PixArtAlphaTextProjection(nn.Module):
+         """Custom implementation for compatibility with older diffusers versions."""
+         def __init__(self, *args, **kwargs):
+             super().__init__()
+
+         def forward(self, x):
+             return x
+
  from diffusers.models.modeling_utils import ModelMixin
- from diffusers.models.normalization import AdaLayerNormSingle
+
+ # Import normalization with fallbacks
+ try:
+     from diffusers.models.normalization import AdaLayerNormSingle
+ except ImportError:
+     # Define a custom AdaLayerNormSingle
+     class AdaLayerNormSingle(nn.Module):
+         """Custom implementation for compatibility with older diffusers versions."""
+         def __init__(self, embedding_dim, emb_dim=None):
+             super().__init__()
+             self.emb_layer = nn.Linear(emb_dim or embedding_dim, embedding_dim)
+             self.norm = nn.LayerNorm(embedding_dim, elementwise_affine=False)
+
+         def forward(self, x, emb):
+             shift = self.emb_layer(emb).unsqueeze(1)
+             x = self.norm(x)
+             return x + shift
 
 
  logger = logging.get_logger(__name__)  # pylint: disable=invalid-name
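
A shape check for the PatchEmbed fallback above (a sketch; the 3-channel input and 1024-dim output are the values hardcoded in the fallback):

```python
import torch
import torch.nn as nn

# Same dummy as above: a 1x1 conv, then flatten spatial dims into a token axis.
proj = nn.Conv2d(3, 1024, kernel_size=1)
x = torch.randn(2, 3, 32, 32)                # (batch, channels, h, w)
tokens = proj(x).flatten(2).transpose(1, 2)
print(tokens.shape)                          # torch.Size([2, 1024, 1024]), i.e. (batch, h*w, 1024)
```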
unet2d_custom.py CHANGED
@@ -8,10 +8,32 @@ import torch.nn as nn
  import torch.utils.checkpoint
 
  from diffusers.configuration_utils import ConfigMixin, register_to_config
- from diffusers.loaders import PeftAdapterMixin, UNet2DConditionLoadersMixin
- #from diffusers.loaders import UNet2DConditionLoadersMixin
-
- from diffusers.utils import USE_PEFT_BACKEND, BaseOutput, deprecate, logging, scale_lora_layers, unscale_lora_layers
+ # Modified to handle older diffusers versions (0.21.4)
+ try:
+     from diffusers.loaders import PeftAdapterMixin, UNet2DConditionLoadersMixin
+ except ImportError:
+     from diffusers.loaders import UNet2DConditionLoadersMixin
+
+     # Define a dummy mixin for backward compatibility
+     class PeftAdapterMixin:
+         """Dummy PeftAdapterMixin for compatibility with older diffusers versions."""
+         pass
+
+ # Check whether USE_PEFT_BACKEND is available in diffusers
+ try:
+     from diffusers.utils import USE_PEFT_BACKEND, BaseOutput, deprecate, logging, scale_lora_layers, unscale_lora_layers
+ except ImportError:
+     from diffusers.utils import BaseOutput, deprecate, logging
+
+     # Define placeholders for the missing utilities
+     USE_PEFT_BACKEND = False
+
+     def scale_lora_layers(model, weight):
+         """Dummy function for compatibility with older diffusers versions."""
+         pass
+
+     def unscale_lora_layers(model, weight):
+         """Dummy function for compatibility with older diffusers versions."""
+         pass
+
  from diffusers.models.activations import get_activation
 
  from diffusers.models.attention_processor import (
@@ -22,18 +44,57 @@ from diffusers.models.attention_processor import (
      AttnAddedKVProcessor,
      AttnProcessor,
  )
- from diffusers.models.embeddings import (
-     GaussianFourierProjection,
-     #GLIGENTextBoundingboxProjection,
-     ImageHintTimeEmbedding,
-     ImageProjection,
-     ImageTimeEmbedding,
-     TextImageProjection,
-     TextImageTimeEmbedding,
-     TextTimeEmbedding,
-     TimestepEmbedding,
-     Timesteps,
- )
+ try:
+     from diffusers.models.embeddings import (
+         GaussianFourierProjection,
+         GLIGENTextBoundingboxProjection,
+         ImageHintTimeEmbedding,
+         ImageProjection,
+         ImageTimeEmbedding,
+         TextImageProjection,
+         TextImageTimeEmbedding,
+         TextTimeEmbedding,
+         TimestepEmbedding,
+         Timesteps,
+     )
+ except ImportError:
+     # For older diffusers versions
+     from diffusers.models.embeddings import (
+         GaussianFourierProjection,
+         ImageProjection,
+         TextTimeEmbedding,
+         TimestepEmbedding,
+         Timesteps,
+     )
+
+     # Define the missing classes for compatibility
+     class GLIGENTextBoundingboxProjection(nn.Module):
+         """Dummy class for compatibility with older diffusers versions."""
+         def __init__(self, positive_len=None, out_dim=None, feature_type=None):
+             super().__init__()
+             self.positive_len = positive_len
+             self.out_dim = out_dim
+             self.feature_type = feature_type
+
+     class ImageHintTimeEmbedding(nn.Module):
+         """Dummy class for compatibility with older diffusers versions."""
+         def __init__(self, image_embed_dim=None, time_embed_dim=None):
+             super().__init__()
+
+     class ImageTimeEmbedding(nn.Module):
+         """Dummy class for compatibility with older diffusers versions."""
+         def __init__(self, image_embed_dim=None, time_embed_dim=None):
+             super().__init__()
+
+     class TextImageProjection(nn.Module):
+         """Dummy class for compatibility with older diffusers versions."""
+         def __init__(self, text_embed_dim=None, image_embed_dim=None, cross_attention_dim=None):
+             super().__init__()
+
+     class TextImageTimeEmbedding(nn.Module):
+         """Dummy class for compatibility with older diffusers versions."""
+         def __init__(self, text_embed_dim=None, image_embed_dim=None, time_embed_dim=None):
+             super().__init__()
+
  from diffusers.models.modeling_utils import ModelMixin
 
  from unet_2d_blocks_custom import (
@@ -60,6 +121,7 @@ class UNet2DConditionOutput(BaseOutput):
      sample: torch.FloatTensor = None
 
 
+ # Modified for compatibility with older diffusers
  class UNet2DConditionModel(ModelMixin, ConfigMixin, UNet2DConditionLoadersMixin, PeftAdapterMixin):
      r"""
      A conditional 2D UNet model that takes a noisy sample, conditional state, and a timestep and returns a sample
unet_2d_blocks_custom.py CHANGED
@@ -8,24 +8,136 @@ import torch.nn.functional as F
  from torch import nn
 
  from diffusers.utils import deprecate, is_torch_version, logging
- from diffusers.utils.torch_utils import apply_freeu
+
+ # Import apply_freeu, or define it if it is not available
+ try:
+     from diffusers.utils.torch_utils import apply_freeu
+ except ImportError:
+     # Define a custom apply_freeu function for compatibility
+     def apply_freeu(
+         feats: torch.Tensor,
+         hidden_states: torch.Tensor,
+         res_hidden_states: torch.Tensor,
+         s1: float,
+         s2: float,
+         b1: float,
+         b2: float,
+     ) -> torch.Tensor:
+         """
+         Custom stand-in for FreeU on older diffusers versions.
+         See https://github.com/ChenyangSi/FreeU for details.
+
+         Args:
+             feats: Features at the current layer
+             hidden_states: Hidden states from the previous layer
+             res_hidden_states: Residual hidden states from the previous layer
+             s1, s2: Scaling factors for the skip-feature frequency components
+             b1, b2: Scaling factors for the backbone hidden states
+
+         Returns:
+             The processed feature map
+         """
+         if all(param is None for param in [s1, s2, b1, b2]):
+             return hidden_states
+
+         # Simple implementation that passes the hidden states through unchanged;
+         # this keeps call sites working without the actual FreeU rescaling.
+         return hidden_states
 
  from diffusers.models.activations import get_activation
  from diffusers.models.attention_processor import Attention, AttnAddedKVProcessor, AttnAddedKVProcessor2_0
- from diffusers.models.normalization import AdaGroupNorm
-
- from diffusers.models.resnet import (
-     Downsample2D,
-     FirDownsample2D,
-     FirUpsample2D,
-     KDownsample2D,
-     KUpsample2D,
-     ResnetBlock2D,
-     ResnetBlockCondNorm2D,
-     Upsample2D,
- )
-
- from diffusers.models.transformers.dual_transformer_2d import DualTransformer2DModel
+
+ # Handle a missing AdaGroupNorm
+ try:
+     from diffusers.models.normalization import AdaGroupNorm
+ except ImportError:
+     # Define a custom AdaGroupNorm class if it is not available
+     class AdaGroupNorm(nn.Module):
+         """Custom implementation of AdaGroupNorm for compatibility with older diffusers versions."""
+
+         def __init__(self, embedding_dim, num_groups=32, eps=1e-5):
+             super().__init__()
+             self.num_groups = num_groups
+             self.eps = eps
+             self.embedding_dim = embedding_dim
+
+             self.linear = nn.Linear(embedding_dim, embedding_dim * 2)
+
+         def forward(self, x, emb):
+             # Simple implementation that falls back to a parameter-free GroupNorm
+             emb = self.linear(emb)
+             emb = emb[:, :, None, None]
+             scale, shift = emb.chunk(2, dim=1)
+
+             # Use the standard GroupNorm, then apply the scale and shift
+             x = nn.functional.group_norm(x, self.num_groups, eps=self.eps)
+             return x * (1 + scale) + shift
+
+ # Import resnet components with fallbacks for older diffusers versions
+ try:
+     from diffusers.models.resnet import (
+         Downsample2D,
+         FirDownsample2D,
+         FirUpsample2D,
+         KDownsample2D,
+         KUpsample2D,
+         ResnetBlock2D,
+         ResnetBlockCondNorm2D,
+         Upsample2D,
+     )
+ except ImportError:
+     # Import what is available
+     from diffusers.models.resnet import (
+         Downsample2D,
+         FirDownsample2D,
+         FirUpsample2D,
+         KDownsample2D,
+         KUpsample2D,
+         ResnetBlock2D,
+         Upsample2D,
+     )
+
+     # Define a custom ResnetBlockCondNorm2D class
+     class ResnetBlockCondNorm2D(nn.Module):
+         """
+         Resnet block with conditional normalization, for compatibility with older diffusers versions.
+         Falls back to a plain ResnetBlock2D and ignores the conditional-norm arguments.
+         """
+         def __init__(self, *args, **kwargs):
+             super().__init__()
+             # Use ResnetBlock2D as the fallback
+             self.block = ResnetBlock2D(*args, **kwargs)
+
+         def forward(self, hidden_states, temb=None, scale=None):
+             return self.block(hidden_states, temb)
+
+ # Import transformer models
+ try:
+     from diffusers.models.transformers.dual_transformer_2d import DualTransformer2DModel
+ except ImportError:
+     # Define a custom DualTransformer2DModel for older diffusers versions
+     class DualTransformer2DModel(nn.Module):
+         """Dummy implementation for older diffusers versions."""
+         def __init__(self, *args, **kwargs):
+             super().__init__()
+
+         def forward(self, *args, **kwargs):
+             raise NotImplementedError("DualTransformer2DModel is not available in this version of diffusers")
+
+ # Use our custom Transformer2DModel
  from transformer_2d_custom import Transformer2DModel
 
  #from diffusers.models.transformers.transformer_2d import Transformer2DModel
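
A shape check for the AdaGroupNorm fallback above (a sketch; it assumes the conditioning embedding width matches the channel count, as the fallback's single Linear layer requires):

```python
import torch
import torch.nn as nn

# Same FiLM-style fallback as above: scale/shift derived from an embedding,
# applied on top of a parameter-free GroupNorm.
class AdaGroupNorm(nn.Module):
    def __init__(self, embedding_dim, num_groups=32, eps=1e-5):
        super().__init__()
        self.num_groups = num_groups
        self.eps = eps
        self.linear = nn.Linear(embedding_dim, embedding_dim * 2)

    def forward(self, x, emb):
        scale, shift = self.linear(emb)[:, :, None, None].chunk(2, dim=1)
        x = nn.functional.group_norm(x, self.num_groups, eps=self.eps)
        return x * (1 + scale) + shift

x = torch.randn(2, 64, 32, 32)  # (batch, channels, h, w)
emb = torch.randn(2, 64)        # conditioning embedding, width == channels
print(AdaGroupNorm(64)(x, emb).shape)  # torch.Size([2, 64, 32, 32])
```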