Commit 420f791
Tahereh committed · Parent(s): 0b968b4
Update to Generative Inference for Psychiatry Demo: add Noise stimulus, update parameters, fix model loading, and improve UI
- .DS_Store +0 -0
- .gitignore +30 -0
- .smbdeleteAAA29de78c16 +0 -0
- DIFFERENCES.md +137 -0
- README.md +1 -1
- TROUBLESHOOTING.md +140 -0
- __pycache__/inference.cpython-311.pyc +0 -0
- app.py +121 -39
- face_vase_black.png +0 -0
- huggingface-metadata.json +1 -1
- inference.py +485 -729
- logs/model_loading_resnet50_robust_face.log +4 -41
- models/resnet50_imagenet_L2_eps_0.50_checkpoint150.pt +0 -3
- models/resnet50_robust.pt +0 -3
- models/resnet50_robust_face_100_checkpoint.pt +0 -3
- models/robust_resnet50.pt +0 -3
- models/standard_resnet50.pt +0 -3
- stimuli/RandomizedPhaseOvalGray.png +0 -0
.DS_Store
ADDED
Binary file (6.15 kB).
.gitignore
ADDED
@@ -0,0 +1,30 @@
+# Model checkpoints (downloaded automatically)
+models/*.pt
+models/*.ckpt
+
+# Python cache
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+
+# Logs
+logs/
+*.log
+
+# Environment
+.env
+.venv
+env/
+venv/
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+
+# OS
+.DS_Store
+Thumbs.db
+
.smbdeleteAAA29de78c16
ADDED
Binary file (10.9 kB).
DIFFERENCES.md
ADDED
@@ -0,0 +1,137 @@
+# Differences Between Reference Code and Current Implementation
+
+## Critical Differences Affecting Results
+
+### 1. **First Iteration Handling** ⚠️ **CRITICAL**
+**Reference Code:**
+```python
+if itr == 0:
+    # Don't add priors or diffusion noise to the first iteration
+    output = model(image_tensor)
+    # ... just get predictions, no gradient update
+else:
+    # Calculate loss and gradients
+    if loss_infer == 'PGDD':
+        loss = torch.nn.functional.mse_loss(features, noisy_features)
+    grad = torch.autograd.grad(loss, image_tensor)[0]
+    adjusted_grad = inferstep.step(image_tensor, grad)
+    # ... apply gradient and noise
+```
+
+**Current Implementation:**
+- **MISSING**: No check for `itr == 0` or `i == 0`
+- Applies gradients and diffusion noise from the very first iteration
+- This causes different starting behavior
+
+### 2. **Model Extraction for PGDD**
+**Reference Code:**
+```python
+new_model = extract_middle_layers(model.module, top_layer)
+```
+
+**Current Implementation:**
+- Complex logic to handle Sequential models with normalizers
+- Extracts from `model[1]` if Sequential, otherwise from `model`
+- May handle DataParallel differently
+
+### 3. **Gradient Calculation**
+**Reference Code:**
+```python
+grad = torch.autograd.grad(loss, image_tensor)[0]  # No retain_graph for PGDD
+```
+
+**Current Implementation:**
+- Same for PGDD (no retain_graph)
+- But uses `retain_graph=True` for IncreaseConfidence
+
+### 4. **Normalization Handling**
+**Reference Code:**
+- Normalization is applied in the transform at the beginning
+- `inference_normalization` controls whether the transform includes normalization
+- Model forward pass uses the already-normalized tensor
+
+**Current Implementation:**
+- Complex logic checking if the model is Sequential with NormalizeByChannelMeanStd
+- May apply normalization multiple times or inconsistently
+- Different paths for sequential vs non-sequential models
+
+### 5. **Variable Naming and Structure**
+**Reference Code:**
+- Uses `image_tensor` throughout the loop
+- Directly modifies `image_tensor` with `requires_grad=True`
+
+**Current Implementation:**
+- Creates a separate `x = image_tensor.clone().detach().requires_grad_(True)`
+- Uses `x` in the loop instead of `image_tensor`
+
+### 6. **Loss Function for IncreaseConfidence**
+**Reference Code:**
+```python
+loss = calculate_loss(features, least_confident_classes[0], loss_function)
+# Uses CrossEntropyLoss or MSELoss based on loss_function
+```
+
+**Current Implementation:**
+```python
+# Creates one-hot targets and uses MSE on softmax outputs
+loss = loss + F.mse_loss(F.softmax(output, dim=1), one_hot)
+```
+- Different loss calculation method
+- Uses MSE on softmax probabilities vs CrossEntropy on logits
+
+### 7. **Diffusion Noise Application**
+**Reference Code:**
+```python
+if itr == 0:
+    # Skip noise
+else:
+    diffusion_noise = diffusion_noise_ratio * torch.randn_like(image_tensor).cuda()
+    if loss_infer == 'GradModulation':
+        image_tensor = inferstep.project(
+            image_tensor.clone() +
+            adjusted_grad * grad_modulation +
+            diffusion_noise * grad_modulation
+        )
+    else:
+        image_tensor = inferstep.project(
+            image_tensor.clone() + adjusted_grad + diffusion_noise
+        )
+```
+
+**Current Implementation:**
+- Always applies diffusion noise (no `itr == 0` check)
+- Applies noise in all iterations, including the first
+
+### 8. **Model Forward Pass in Loop**
+**Reference Code:**
+```python
+if inference_config['misc_info'].get('smooth_inference', False):
+    # Smooth inference logic
+else:
+    new_model.zero_grad()
+    features = new_model(image_tensor)
+```
+
+**Current Implementation:**
+```python
+x.grad = None  # Instead of new_model.zero_grad()
+if config['loss_infer'] == 'Prior-Guided Drift Diffusion' and layer_model is not None:
+    output = layer_model(x)
+else:
+    output = model(x)
+```
+
+## Summary of Impact
+
+1. **First iteration difference**: Most critical; the reference skips the gradient update on iteration 0
+2. **Normalization**: Different application may cause numerical differences
+3. **Loss calculation**: Different methods for IncreaseConfidence
+4. **Model extraction**: May extract different layers due to Sequential handling
+
+## Recommended Fixes
+
+1. Add an `if i == 0:` check to skip the gradient update on the first iteration
+2. Simplify model extraction to match the reference: `extract_middle_layers(model.module, top_layer)`
+3. Align the loss calculation for IncreaseConfidence with the reference
+4. Ensure normalization is applied consistently
+
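Note: a minimal sketch of Recommended Fix 1, assembled from the reference snippets above. `model`, `inferstep`, `image_tensor`, `n_itr`, and `diffusion_noise_ratio` follow the naming in those snippets; `loss_fn` is a hypothetical placeholder for whichever loss the method uses.

```python
import torch

# Guard the first iteration: forward pass only, no gradient update, no noise,
# matching the reference behavior described in section 1.
for i in range(n_itr):
    if i == 0:
        with torch.no_grad():
            output = model(image_tensor)  # predictions only
        continue
    image_tensor.requires_grad_(True)
    loss = loss_fn(model(image_tensor))          # placeholder loss
    grad = torch.autograd.grad(loss, image_tensor)[0]
    adjusted_grad = inferstep.step(image_tensor, grad)
    diffusion_noise = diffusion_noise_ratio * torch.randn_like(image_tensor)
    image_tensor = inferstep.project(
        image_tensor.detach() + adjusted_grad + diffusion_noise
    )
```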
README.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-title: Generative Inference Demo
+title: Generative Inference for Psychiatry Demo
 emoji: 🧠
 colorFrom: indigo
 colorTo: purple
TROUBLESHOOTING.md
ADDED
@@ -0,0 +1,140 @@
+# Troubleshooting: Why It Works on Hugging Face Spaces But Not Locally
+
+## Common Issues and Solutions
+
+### 1. **Missing Dependencies** ⚠️ (Most Common)
+
+**Problem**: The required Python packages are not installed locally.
+
+**Solution**: Install all dependencies:
+```bash
+cd /home/tahereh/engram/users/Tahereh/Codes/Public_Codes/Generative_Inference_Faces
+pip install -r requirements.txt
+```
+
+**Required packages**:
+- `torch` and `torchvision` (PyTorch)
+- `gradio` (for the web interface)
+- `numpy`, `pillow` (PIL), `matplotlib`
+- `requests`, `tqdm`, `huggingface_hub`
+
+### 2. **GPU Decorator** ✅ (Fixed)
+
+**Problem**: The `@GPU` decorator from Hugging Face Spaces is not available locally.
+
+**Solution**: The code now automatically handles this:
+- On Hugging Face Spaces: uses the `spaces.GPU` decorator
+- Locally: uses a no-op decorator (GPU detection is automatic via PyTorch)
+
+**Status**: ✅ Fixed in the code
+
+### 3. **Port Configuration** ✅ (Fixed)
+
+**Problem**: Port configuration was inconsistent between local and Spaces environments.
+
+**Solution**: The code now:
+- Uses port 7860 by default (same as Spaces)
+- Allows a custom port via the `--port` argument
+- Automatically detects the Hugging Face Spaces environment
+
+**Status**: ✅ Fixed in the code
+
+### 4. **Model Files Not Downloaded**
+
+**Problem**: Model checkpoint files may not be downloaded yet.
+
+**Solution**: The code will automatically download models on first run, but you can verify:
+```bash
+ls models/
+```
+
+Expected files:
+- `resnet50_robust.pt`
+- `standard_resnet50.pt` (optional)
+- `resnet50_robust_face_100_checkpoint.pt` (optional)
+
+### 5. **Missing Stimuli Images**
+
+**Problem**: Example images may be missing.
+
+**Solution**: Verify that the stimuli directory exists:
+```bash
+ls stimuli/
+```
+
+All example images should be present for the demo to work fully.
+
+### 6. **CUDA/GPU Issues**
+
+**Problem**: GPU may not be available or configured correctly.
+
+**Solution**: The code automatically detects available hardware:
+- CUDA (NVIDIA GPUs)
+- MPS (Apple Silicon)
+- CPU (fallback)
+
+Check your setup:
+```python
+import torch
+print("CUDA available:", torch.cuda.is_available())
+print("Device:", torch.device("cuda" if torch.cuda.is_available() else "cpu"))
+```
+
+### 7. **Python Version**
+
+**Problem**: Incompatible Python version.
+
+**Solution**: Use Python 3.8+ (tested with 3.11.5):
+```bash
+python --version
+```
+
+## Quick Start Guide
+
+1. **Install dependencies**:
+```bash
+pip install -r requirements.txt
+```
+
+2. **Run the app**:
+```bash
+python app.py
+```
+
+Or with a custom port:
+```bash
+python app.py --port 8080
+```
+
+3. **Access the web interface**:
+- Open your browser to `http://localhost:7860`
+- Or the port you specified
+
+## Differences Between Hugging Face Spaces and Local
+
+| Feature | Hugging Face Spaces | Local |
+|---------|---------------------|-------|
+| GPU Decorator | `@spaces.GPU` available | No-op decorator (automatic GPU) |
+| Port | Set via `PORT` env var | Default 7860, or `--port` arg |
+| Dependencies | Pre-installed | Must install manually |
+| Environment | `SPACE_ID` env var set | Not set |
+| Model Storage | Persistent storage | Local `models/` directory |
+
+## Testing the Fixes
+
+After applying the fixes, test with:
+```bash
+# Check imports work
+python -c "import gradio, torch, numpy, PIL; print('All imports OK')"
+
+# Run the app
+python app.py --port 7860
+```
+
+## Still Having Issues?
+
+1. **Check error messages**: Look for specific import errors or file-not-found errors
+2. **Verify Python environment**: Make sure you're using the correct virtual environment
+3. **Check file permissions**: Ensure the `models/` and `stimuli/` directories are writable
+4. **Review logs**: Check the `logs/` directory for model loading issues
+
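Note: a minimal sketch of the three-way fallback described in issue 6, mirroring the device-selection block at the top of `inference.py`. The `torch.backends.mps` check assumes PyTorch 1.12+.

```python
import torch

# Prefer CUDA, then MPS (Apple Silicon), then fall back to CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
print(f"Using device: {device}")
```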
__pycache__/inference.cpython-311.pyc
ADDED
Binary file (40.8 kB).
app.py
CHANGED
@@ -4,10 +4,13 @@ import numpy as np
 from PIL import Image
 try:
     from spaces import GPU
+    print("Running on Hugging Face Spaces - GPU decorator available")
 except ImportError:
     # Define a no-op decorator if running locally
     def GPU(func):
+        """No-op decorator for local execution (GPU handling is automatic)"""
         return func
+    print("Running locally - GPU decorator not available (using automatic GPU detection)")
 
 import os
 import argparse
@@ -15,7 +18,7 @@ from inference import GenerativeInferenceModel, get_inference_configs
 
 # Parse command line arguments
 parser = argparse.ArgumentParser(description='Run Generative Inference Demo')
-parser.add_argument('--port', type=int, default=
+parser.add_argument('--port', type=int, default=None, help='Port to run the server on')
 args = parser.parse_args()
 
 # Create model directories if they don't exist
@@ -26,13 +29,54 @@ os.makedirs("stimuli", exist_ok=True)
 if "SPACE_ID" in os.environ:
     default_port = int(os.environ.get("PORT", 7860))
 else:
-    default_port =
+    default_port = 7860  # Use same default port locally
+
+# Use command line port if provided, otherwise use default
+server_port = args.port if args.port is not None else default_port
 
 # Initialize model
 model = GenerativeInferenceModel()
 
 # Define example images and their parameters with updated values from the research
 examples = [
+    {
+        "image": os.path.join("stimuli", "face_vase.png"),
+        "name": "Rubin's Face-Vase (Object Prior)",
+        "wiki": "https://en.wikipedia.org/wiki/Rubin_vase",
+        "papers": [
+            "[Figure-Ground Perception](https://en.wikipedia.org/wiki/Figure-ground_(perception))",
+            "[Bistable Perception](https://doi.org/10.1016/j.tics.2003.08.003)"
+        ],
+        "method": "Prior-Guided Drift Diffusion",
+        "reverse_diff": {
+            "model": "resnet50_robust_face",
+            "layer": "layer4",
+            "initial_noise": 0.0,
+            "diffusion_noise": 0.006,
+            "step_size": 0.18,
+            "iterations": 100,
+            "epsilon": 9.53
+        }
+    },
+    {
+        "image": os.path.join("stimuli", "RandomizedPhaseOvalGray.png"),
+        "name": "Noise (Randomized Phase Oval)",
+        "wiki": "https://en.wikipedia.org/wiki/Visual_noise",
+        "papers": [
+            "[Perceptual Organization](https://doi.org/10.1016/j.tics.2003.08.003)",
+            "[Pattern Recognition](https://en.wikipedia.org/wiki/Pattern_recognition)"
+        ],
+        "method": "Prior-Guided Drift Diffusion",
+        "reverse_diff": {
+            "model": "resnet50_robust_face",
+            "layer": "all",
+            "initial_noise": 0.0,
+            "diffusion_noise": 0.05,
+            "step_size": 1.12,
+            "iterations": 428,
+            "epsilon": 198.62
+        }
+    },
     {
         "image": os.path.join("stimuli", "Neon_Color_Circle.jpg"),
         "name": "Neon Color Spreading",
@@ -91,25 +135,6 @@ examples = [
             "epsilon": 20.0
         }
     },
-    {
-        "image": os.path.join("stimuli", "face_vase.png"),
-        "name": "Rubin's Face-Vase (Object Prior)",
-        "wiki": "https://en.wikipedia.org/wiki/Rubin_vase",
-        "papers": [
-            "[Figure-Ground Perception](https://en.wikipedia.org/wiki/Figure-ground_(perception))",
-            "[Bistable Perception](https://doi.org/10.1016/j.tics.2003.08.003)"
-        ],
-        "method": "Prior-Guided Drift Diffusion",
-        "reverse_diff": {
-            "model": "resnet50_robust",
-            "layer": "avgpool",
-            "initial_noise": 0.9,
-            "diffusion_noise": 0.003,
-            "step_size": 0.58,
-            "iterations": 100,
-            "epsilon": 0.81
-        }
-    },
     {
         "image": os.path.join("stimuli", "Confetti_illusion.png"),
         "name": "Confetti Illusion",
@@ -223,21 +248,76 @@ def run_inference(image, model_type, inference_type, eps_value, num_iterations,
     # Create animation frames
     frames = []
     for i, step_image in enumerate(all_steps):
-        # Convert tensor to PIL image
-
-
+        # Convert tensor to PIL image with proper error handling
+        try:
+            # Ensure tensor is on CPU and detached
+            if isinstance(step_image, torch.Tensor):
+                step_image = step_image.detach().cpu()
+                # Handle different tensor shapes
+                if len(step_image.shape) == 4:  # [B, C, H, W]
+                    step_image = step_image[0]  # Take first batch item
+                elif len(step_image.shape) == 3:  # [C, H, W]
+                    pass  # Already correct shape
+                else:
+                    raise ValueError(f"Unexpected tensor shape: {step_image.shape}")
+
+                # Clamp values to [0, 1] range before converting
+                step_image = torch.clamp(step_image, 0, 1)
+                # Convert to numpy and ensure contiguous array
+                step_np = step_image.permute(1, 2, 0).numpy()
+                # Ensure it's a contiguous array with correct dtype
+                step_np = np.ascontiguousarray(step_np, dtype=np.float32)
+                # Convert to uint8
+                step_np = (step_np * 255).astype(np.uint8)
+                # Create PIL image
+                step_pil = Image.fromarray(step_np, mode='RGB')
+                frames.append(step_pil)
+            else:
+                print(f"Warning: step_image at index {i} is not a tensor: {type(step_image)}")
+        except Exception as e:
+            print(f"Error converting step {i} to PIL image: {e}, shape: {step_image.shape if hasattr(step_image, 'shape') else 'N/A'}")
+            # Skip this frame if conversion fails
+            continue
 
     # Convert the final output image to PIL
-
+    try:
+        if isinstance(output_image, torch.Tensor):
+            output_image = output_image.detach().cpu()
+            # Handle different tensor shapes
+            if len(output_image.shape) == 4:  # [B, C, H, W]
+                output_image = output_image[0]  # Take first batch item
+            elif len(output_image.shape) == 3:  # [C, H, W]
+                pass  # Already correct shape
+            else:
+                raise ValueError(f"Unexpected tensor shape: {output_image.shape}")
+
+            # Clamp values to [0, 1] range before converting
+            output_image = torch.clamp(output_image, 0, 1)
+            # Convert to numpy and ensure contiguous array
+            output_np = output_image.permute(1, 2, 0).numpy()
+            # Ensure it's a contiguous array with correct dtype
+            output_np = np.ascontiguousarray(output_np, dtype=np.float32)
+            # Convert to uint8
+            output_np = (output_np * 255).astype(np.uint8)
+            # Create PIL image
+            final_image = Image.fromarray(output_np, mode='RGB')
+        else:
+            raise ValueError(f"output_image is not a tensor: {type(output_image)}")
+    except Exception as e:
+        print(f"Error converting final image to PIL: {e}, shape: {output_image.shape if hasattr(output_image, 'shape') else 'N/A'}")
+        # Return a black image as fallback
+        final_image = Image.new('RGB', (224, 224), color='black')
 
     # Return the final inferred image and the animation frames directly
     return final_image, frames
 
 # Helper function to apply example parameters
 def apply_example(example):
+    # Get the full path to the image file
+    image_path = os.path.abspath(example["image"]) if os.path.exists(example["image"]) else example["image"]
     return [
-
-        "
+        image_path,
+        example["reverse_diff"]["model"],  # Model type from example
         example["method"],  # Inference type
         example["reverse_diff"]["epsilon"],  # Epsilon value
         example["reverse_diff"]["iterations"],  # Number of iterations
@@ -249,7 +329,7 @@ def apply_example(example):
     ]
 
 # Define the interface
-with gr.Blocks(title="Generative Inference Demo", css="""
+with gr.Blocks(title="Generative Inference for Psychiatry Demo", css="""
 .purple-button {
     background-color: #8B5CF6 !important;
     color: white !important;
@@ -259,7 +339,7 @@ with gr.Blocks(title="Generative Inference Demo", css="""
     background-color: #7C3AED !important;
 }
 """) as demo:
-    gr.Markdown("# Generative Inference Demo")
+    gr.Markdown("# Generative Inference for Psychiatry Demo")
     gr.Markdown("This demo showcases how neural networks can perceive visual illusions and develop Gestalt principles of perceptual organization through generative inference.")
 
     gr.Markdown("""
@@ -273,7 +353,9 @@ with gr.Blocks(title="Generative Inference Demo", css="""
     with gr.Row():
         with gr.Column(scale=1):
             # Inputs
-
+            # Use absolute path for default image to avoid directory errors
+            default_image_path = os.path.abspath(os.path.join("stimuli", "face_vase.png")) if os.path.exists(os.path.join("stimuli", "face_vase.png")) else None
+            image_input = gr.Image(label="Input Image", type="pil", value=default_image_path)
 
             # Run Inference button right below the image
             run_button = gr.Button("🪄 Run Generative Inference", variant="primary", elem_classes="purple-button")
@@ -286,7 +368,7 @@ with gr.Blocks(title="Generative Inference Demo", css="""
             with gr.Row():
                 model_choice = gr.Dropdown(
                     choices=["resnet50_robust", "standard_resnet50", "resnet50_robust_face"],  # "resnet50_robust_face" - hidden for deployment
-                    value="
+                    value="resnet50_robust_face",
                     label="Model"
                 )
@@ -297,21 +379,21 @@ with gr.Blocks(title="Generative Inference Demo", css="""
             )
 
             with gr.Row():
-                eps_slider = gr.Slider(minimum=0.0, maximum=
-                iterations_slider = gr.Slider(minimum=1, maximum=600, value=
+                eps_slider = gr.Slider(minimum=0.0, maximum=200.0, value=9.53, step=0.01, label="Epsilon (Stimulus Fidelity)")
+                iterations_slider = gr.Slider(minimum=1, maximum=600, value=100, step=1, label="Number of Iterations")  # Updated max to 600
 
             with gr.Row():
-                initial_noise_slider = gr.Slider(minimum=0.0, maximum=
+                initial_noise_slider = gr.Slider(minimum=0.0, maximum=5.0, value=0.0, step=0.01,
                                                  label="Drift Noise")
-                diffusion_noise_slider = gr.Slider(minimum=0.0, maximum=0
+                diffusion_noise_slider = gr.Slider(minimum=0.0, maximum=1.0, value=0.006, step=0.001,
                                                    label="Diffusion Noise")  # Corrected name
 
             with gr.Row():
-                step_size_slider = gr.Slider(minimum=0.
+                step_size_slider = gr.Slider(minimum=0.0, maximum=10.0, value=0.18, step=0.01,
                                              label="Update Rate")  # Added step size slider
                 layer_choice = gr.Dropdown(
                     choices=["all", "conv1", "bn1", "relu", "maxpool", "layer1", "layer2", "layer3", "layer4", "avgpool"],
-                    value="
+                    value="layer4",
                     label="Model Layer"
                 )
@@ -403,10 +485,10 @@ with gr.Blocks(title="Generative Inference Demo", css="""
 
 # Launch the demo
 if __name__ == "__main__":
-    print(f"Starting server on port {
+    print(f"Starting server on port {server_port}")
     demo.launch(
         server_name="0.0.0.0",
-        server_port=
+        server_port=server_port,
         share=False,
         debug=True
     )
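Note: the frame loop and the final-image branch above repeat the same tensor-to-PIL conversion. A possible follow-up (not part of this commit) is to factor it into one helper; `tensor_to_pil` is a hypothetical name:

```python
import numpy as np
import torch
from PIL import Image

def tensor_to_pil(t: torch.Tensor) -> Image.Image:
    """Hypothetical helper collapsing the duplicated conversion blocks above."""
    t = t.detach().cpu()
    if t.dim() == 4:        # [B, C, H, W]: take the first batch item
        t = t[0]
    elif t.dim() != 3:      # expect [C, H, W]
        raise ValueError(f"Unexpected tensor shape: {tuple(t.shape)}")
    t = torch.clamp(t, 0, 1)                  # keep a valid pixel range
    arr = t.permute(1, 2, 0).numpy()          # HWC float array
    arr = np.ascontiguousarray(arr, dtype=np.float32)
    return Image.fromarray((arr * 255).astype(np.uint8), mode='RGB')
```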
face_vase_black.png
ADDED
huggingface-metadata.json
CHANGED
|
@@ -1,5 +1,5 @@
|
|
| 1 |
{
|
| 2 |
-
"title": "Generative Inference Demo",
|
| 3 |
"emoji": "🧠",
|
| 4 |
"colorFrom": "indigo",
|
| 5 |
"colorTo": "purple",
|
|
|
|
| 1 |
{
|
| 2 |
+
"title": "Generative Inference for Psychiatry Demo",
|
| 3 |
"emoji": "🧠",
|
| 4 |
"colorFrom": "indigo",
|
| 5 |
"colorTo": "purple",
|
inference.py
CHANGED
|
@@ -1,8 +1,10 @@
|
|
|
|
|
|
|
|
| 1 |
import torch
|
| 2 |
import torch.nn as nn
|
| 3 |
import torch.nn.functional as F
|
| 4 |
-
import torchvision.models as models
|
| 5 |
import torchvision.transforms as transforms
|
|
|
|
| 6 |
from torchvision.models.resnet import ResNet50_Weights
|
| 7 |
from PIL import Image
|
| 8 |
import numpy as np
|
|
@@ -12,6 +14,7 @@ import time
|
|
| 12 |
import copy
|
| 13 |
from collections import OrderedDict
|
| 14 |
from pathlib import Path
|
|
|
|
| 15 |
|
| 16 |
# Check for available hardware acceleration
|
| 17 |
if torch.cuda.is_available():
|
|
@@ -22,175 +25,98 @@ else:
|
|
| 22 |
device = torch.device("cpu")
|
| 23 |
print(f"Using device: {device}")
|
| 24 |
|
| 25 |
-
# Constants
|
| 26 |
MODEL_URLS = {
|
| 27 |
'resnet50_robust': 'https://huggingface.co/madrylab/robust-imagenet-models/resolve/main/resnet50_l2_eps3.ckpt',
|
| 28 |
'resnet50_standard': 'https://huggingface.co/madrylab/robust-imagenet-models/resolve/main/resnet50_l2_eps0.ckpt',
|
| 29 |
'resnet50_robust_face': 'https://huggingface.co/ttoosi/resnet50_robust_face/resolve/main/resnet50_imagenet_L2_eps_0.50_checkpoint150.pt'
|
| 30 |
}
|
| 31 |
|
| 32 |
-
#
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 38 |
}
|
| 39 |
|
| 40 |
IMAGENET_MEAN = [0.485, 0.456, 0.406]
|
| 41 |
IMAGENET_STD = [0.229, 0.224, 0.225]
|
| 42 |
|
| 43 |
-
|
| 44 |
-
|
| 45 |
-
if
|
| 46 |
-
return
|
| 47 |
-
|
| 48 |
-
|
| 49 |
-
|
| 50 |
-
|
| 51 |
-
|
|
|
|
| 52 |
else:
|
| 53 |
-
return
|
| 54 |
-
|
| 55 |
-
transforms.CenterCrop(input_size),
|
| 56 |
-
transforms.ToTensor(),
|
| 57 |
-
])
|
| 58 |
-
|
| 59 |
-
# Default transform without normalization
|
| 60 |
-
transform = transforms.Compose([
|
| 61 |
-
transforms.Resize(224),
|
| 62 |
-
transforms.CenterCrop(224),
|
| 63 |
-
transforms.ToTensor(),
|
| 64 |
-
])
|
| 65 |
-
|
| 66 |
-
normalize_transform = transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD)
|
| 67 |
|
| 68 |
-
def
|
| 69 |
-
"""
|
| 70 |
-
Extract a subset of the model up to a specific layer.
|
| 71 |
-
|
| 72 |
-
Args:
|
| 73 |
-
model: The neural network model
|
| 74 |
-
layer_index: String 'all' for the full model, or a layer identifier (string or int)
|
| 75 |
-
For ResNet: integers 0-8 representing specific layers
|
| 76 |
-
For ViT: strings like 'encoder.layers.encoder_layer_3'
|
| 77 |
-
|
| 78 |
-
Returns:
|
| 79 |
-
A modified model that outputs features from the specified layer
|
| 80 |
-
"""
|
| 81 |
-
if isinstance(layer_index, str) and layer_index == 'all':
|
| 82 |
-
return model
|
| 83 |
|
| 84 |
-
|
| 85 |
-
|
| 86 |
-
|
| 87 |
-
|
| 88 |
-
|
| 89 |
-
|
| 90 |
-
|
| 91 |
-
|
| 92 |
-
|
| 93 |
-
|
| 94 |
-
|
| 95 |
-
|
| 96 |
-
|
| 97 |
-
layer_name = f"encoder_layer_{i}"
|
| 98 |
-
if hasattr(new_model.module.encoder.layers, layer_name):
|
| 99 |
-
encoder_layers.add_module(layer_name,
|
| 100 |
-
getattr(new_model.module.encoder.layers, layer_name))
|
| 101 |
-
|
| 102 |
-
# Replace the encoder layers with our truncated version
|
| 103 |
-
new_model.module.encoder.layers = encoder_layers
|
| 104 |
-
|
| 105 |
-
# Remove the heads since we're stopping at the encoder layer
|
| 106 |
-
new_model.module.heads = nn.Identity()
|
| 107 |
-
|
| 108 |
-
return new_model
|
| 109 |
-
else:
|
| 110 |
-
# Direct model access (not DataParallel)
|
| 111 |
-
encoder_layers = nn.Sequential()
|
| 112 |
-
for i in range(target_layer_idx + 1):
|
| 113 |
-
layer_name = f"encoder_layer_{i}"
|
| 114 |
-
if hasattr(new_model.encoder.layers, layer_name):
|
| 115 |
-
encoder_layers.add_module(layer_name,
|
| 116 |
-
getattr(new_model.encoder.layers, layer_name))
|
| 117 |
-
|
| 118 |
-
# Replace the encoder layers with our truncated version
|
| 119 |
-
new_model.encoder.layers = encoder_layers
|
| 120 |
-
|
| 121 |
-
# Remove the heads since we're stopping at the encoder layer
|
| 122 |
-
new_model.heads = nn.Identity()
|
| 123 |
-
|
| 124 |
-
return new_model
|
| 125 |
-
|
| 126 |
-
except (ValueError, IndexError) as e:
|
| 127 |
-
raise ValueError(f"Invalid ViT layer specification: {layer_index}. Error: {e}")
|
| 128 |
|
| 129 |
-
|
| 130 |
-
|
| 131 |
-
|
| 132 |
-
|
| 133 |
-
|
| 134 |
-
|
| 135 |
-
|
| 136 |
-
|
| 137 |
-
|
| 138 |
-
|
| 139 |
-
|
| 140 |
-
|
| 141 |
-
|
| 142 |
-
|
| 143 |
-
return new_model
|
| 144 |
|
| 145 |
-
|
| 146 |
-
# Original ResNet/VGG handling
|
| 147 |
-
modules = list(model.named_children())
|
| 148 |
-
print(f"DEBUG - extract_middle_layers - Looking for '{layer_index}' in {[name for name, _ in modules]}")
|
| 149 |
-
|
| 150 |
-
cutoff_idx = next((i for i, (name, _) in enumerate(modules)
|
| 151 |
-
if name == str(layer_index)), None)
|
| 152 |
-
|
| 153 |
-
if cutoff_idx is not None:
|
| 154 |
-
# Keep modules up to and including the target
|
| 155 |
-
new_model = nn.Sequential(OrderedDict(modules[:cutoff_idx+1]))
|
| 156 |
-
return new_model
|
| 157 |
-
else:
|
| 158 |
-
raise ValueError(f"Module {layer_index} not found in model")
|
| 159 |
-
|
| 160 |
-
# Get ImageNet labels
|
| 161 |
-
def get_imagenet_labels():
|
| 162 |
-
url = "https://raw.githubusercontent.com/anishathalye/imagenet-simple-labels/master/imagenet-simple-labels.json"
|
| 163 |
-
response = requests.get(url)
|
| 164 |
-
if response.status_code == 200:
|
| 165 |
-
return response.json()
|
| 166 |
-
else:
|
| 167 |
-
raise RuntimeError("Failed to fetch ImageNet labels")
|
| 168 |
|
| 169 |
-
|
| 170 |
-
|
| 171 |
-
if model_type not in
|
| 172 |
-
|
| 173 |
-
|
| 174 |
-
|
| 175 |
-
if model_type == 'resnet50_robust_face':
|
| 176 |
-
model_path = Path("models/resnet50_robust_face_100_checkpoint.pt")
|
| 177 |
-
else:
|
| 178 |
-
model_path = Path(f"models/{model_type}.pt")
|
| 179 |
-
|
| 180 |
-
if not model_path.exists():
|
| 181 |
-
print(f"Downloading {model_type} model...")
|
| 182 |
-
url = MODEL_URLS[model_type]
|
| 183 |
-
response = requests.get(url, stream=True)
|
| 184 |
-
if response.status_code == 200:
|
| 185 |
-
with open(model_path, 'wb') as f:
|
| 186 |
-
for chunk in response.iter_content(chunk_size=8192):
|
| 187 |
-
f.write(chunk)
|
| 188 |
-
print(f"Model downloaded and saved to {model_path}")
|
| 189 |
-
else:
|
| 190 |
-
raise RuntimeError(f"Failed to download model: {response.status_code}")
|
| 191 |
-
return model_path
|
| 192 |
|
| 193 |
class NormalizeByChannelMeanStd(nn.Module):
|
|
|
|
| 194 |
def __init__(self, mean, std):
|
| 195 |
super(NormalizeByChannelMeanStd, self).__init__()
|
| 196 |
if not isinstance(mean, torch.Tensor):
|
|
@@ -205,737 +131,567 @@ class NormalizeByChannelMeanStd(nn.Module):
|
|
| 205 |
|
| 206 |
def normalize_fn(self, tensor, mean, std):
|
| 207 |
"""Differentiable version of torchvision.functional.normalize"""
|
| 208 |
-
# here we assume the color channel is at dim=1
|
| 209 |
mean = mean[None, :, None, None]
|
| 210 |
std = std[None, :, None, None]
|
| 211 |
return tensor.sub(mean).div(std)
|
| 212 |
|
| 213 |
class InferStep:
|
| 214 |
-
|
|
|
|
|
|
|
| 215 |
self.orig_image = orig_image
|
| 216 |
self.eps = eps
|
| 217 |
self.step_size = step_size
|
| 218 |
|
| 219 |
-
def project(self, x):
|
|
|
|
| 220 |
diff = x - self.orig_image
|
| 221 |
diff = torch.clamp(diff, -self.eps, self.eps)
|
| 222 |
return torch.clamp(self.orig_image + diff, 0, 1)
|
| 223 |
|
| 224 |
-
def step(self, x, grad):
|
| 225 |
-
|
| 226 |
-
|
|
|
|
| 227 |
scaled_grad = grad / (grad_norm + 1e-10)
|
| 228 |
return scaled_grad * self.step_size
|
| 229 |
|
| 230 |
-
def
|
| 231 |
-
"""
|
| 232 |
-
if
|
| 233 |
-
return
|
| 234 |
-
elif n_itr <= 100:
|
| 235 |
-
return [1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, n_itr]
|
| 236 |
-
elif n_itr <= 200:
|
| 237 |
-
return [1, 5, 10, 20, 30, 40, 50, 75, 100, 125, 150, 175, 200, n_itr]
|
| 238 |
-
elif n_itr <= 500:
|
| 239 |
-
return [1, 5, 10, 20, 30, 40, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, n_itr]
|
| 240 |
-
else:
|
| 241 |
-
# For very large iterations, show more evenly distributed points
|
| 242 |
-
return [1, 5, 10, 20, 30, 40, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500,
|
| 243 |
-
int(n_itr*0.6), int(n_itr*0.7), int(n_itr*0.8), int(n_itr*0.9), n_itr]
|
| 244 |
-
|
| 245 |
-
def get_inference_configs(inference_type='IncreaseConfidence', eps=0.5, n_itr=50, step_size=1.0):
|
| 246 |
-
"""Generate inference configuration with customizable parameters.
|
| 247 |
-
|
| 248 |
-
Args:
|
| 249 |
-
inference_type (str): Type of inference ('IncreaseConfidence' or 'Prior-Guided Drift Diffusion')
|
| 250 |
-
eps (float): Maximum perturbation size
|
| 251 |
-
n_itr (int): Number of iterations
|
| 252 |
-
step_size (float): Step size for each iteration
|
| 253 |
-
"""
|
| 254 |
|
| 255 |
-
#
|
| 256 |
-
|
| 257 |
-
|
| 258 |
-
|
| 259 |
-
|
| 260 |
-
|
| 261 |
-
'diffusion_noise_ratio': 0.0, # No diffusion noise
|
| 262 |
-
'initial_inference_noise_ratio': 0.0, # No initial noise
|
| 263 |
-
'top_layer': 'all', # Use all layers of the model
|
| 264 |
-
'inference_normalization': False, # Apply normalization during inference
|
| 265 |
-
'recognition_normalization': False, # Apply normalization during recognition
|
| 266 |
-
'iterations_to_show': get_iterations_to_show(n_itr), # Dynamic iterations to visualize
|
| 267 |
-
'misc_info': {'keep_grads': False} # Additional configuration
|
| 268 |
-
}
|
| 269 |
|
| 270 |
-
|
| 271 |
-
|
| 272 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 273 |
|
| 274 |
-
|
| 275 |
-
|
| 276 |
-
|
| 277 |
-
|
|
|
|
|
|
|
| 278 |
|
| 279 |
-
|
| 280 |
-
config['loss_function'] = 'CE' # Cross Entropy
|
| 281 |
-
config['misc_info']['grad_modulation'] = 0.5 # Gradient modulation strength
|
| 282 |
|
| 283 |
-
|
| 284 |
-
|
| 285 |
-
|
| 286 |
-
|
| 287 |
|
| 288 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 289 |
|
| 290 |
class GenerativeInferenceModel:
|
|
|
|
|
|
|
| 291 |
def __init__(self):
|
| 292 |
self.models = {}
|
| 293 |
-
#self.normalizer = NormalizeByChannelMeanStd(IMAGENET_MEAN, IMAGENET_STD).to(device)
|
| 294 |
self.model_preproc = {}
|
| 295 |
-
self.labels = get_imagenet_labels()
|
| 296 |
|
| 297 |
-
def
|
| 298 |
-
"""
|
| 299 |
-
|
| 300 |
-
Returns whether the model passes basic integrity check.
|
| 301 |
-
"""
|
| 302 |
try:
|
| 303 |
-
|
| 304 |
-
|
| 305 |
-
|
| 306 |
-
|
| 307 |
-
|
| 308 |
-
|
| 309 |
-
# Run forward pass
|
| 310 |
-
with torch.no_grad():
|
| 311 |
-
output = model(test_input)
|
| 312 |
-
|
| 313 |
-
# Check output shape
|
| 314 |
-
if output.shape != (1, 1000):
|
| 315 |
-
print(f"❌ Unexpected output shape: {output.shape}, expected (1, 1000)")
|
| 316 |
-
return False
|
| 317 |
-
|
| 318 |
-
# Get top prediction
|
| 319 |
-
probs = torch.nn.functional.softmax(output, dim=1)
|
| 320 |
-
confidence, prediction = torch.max(probs, 1)
|
| 321 |
-
|
| 322 |
-
# Calculate basic statistics on output
|
| 323 |
-
mean = output.mean().item()
|
| 324 |
-
std = output.std().item()
|
| 325 |
-
min_val = output.min().item()
|
| 326 |
-
max_val = output.max().item()
|
| 327 |
-
|
| 328 |
-
print(f"Model integrity check results:")
|
| 329 |
-
print(f"- Output shape: {output.shape}")
|
| 330 |
-
print(f"- Top prediction: Class {prediction.item()} with {confidence.item()*100:.2f}% confidence")
|
| 331 |
-
print(f"- Output statistics: mean={mean:.3f}, std={std:.3f}, min={min_val:.3f}, max={max_val:.3f}")
|
| 332 |
-
|
| 333 |
-
# Basic sanity checks
|
| 334 |
-
if torch.isnan(output).any():
|
| 335 |
-
print("❌ Model produced NaN outputs")
|
| 336 |
-
return False
|
| 337 |
-
|
| 338 |
-
if output.std().item() < 0.1:
|
| 339 |
-
print("⚠️ Low output variance, model may not be discriminative")
|
| 340 |
-
|
| 341 |
-
print("✅ Model passes basic integrity check")
|
| 342 |
-
return True
|
| 343 |
-
|
| 344 |
except Exception as e:
|
| 345 |
-
print(f"
|
| 346 |
-
|
| 347 |
-
return True
|
| 348 |
|
| 349 |
def load_model(self, model_type):
|
|
|
|
| 350 |
if model_type in self.models:
|
| 351 |
print(f"Using cached {model_type} model")
|
| 352 |
return self.models[model_type]
|
| 353 |
|
| 354 |
start_time = time.time()
|
| 355 |
-
|
| 356 |
-
|
| 357 |
-
|
| 358 |
-
|
| 359 |
-
|
| 360 |
-
|
| 361 |
-
|
| 362 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 363 |
model = nn.Sequential(normalizer, resnet)
|
| 364 |
|
| 365 |
-
#
|
|
|
|
|
|
|
| 366 |
if model_path:
|
| 367 |
print(f"Loading {model_type} model from {model_path}...")
|
| 368 |
try:
|
| 369 |
checkpoint = torch.load(model_path, map_location=device)
|
| 370 |
|
| 371 |
-
# Print checkpoint structure for better understanding
|
| 372 |
-
print("\n=== Analyzing checkpoint structure ===")
|
| 373 |
-
if isinstance(checkpoint, dict):
|
| 374 |
-
print(f"Checkpoint contains keys: {list(checkpoint.keys())}")
|
| 375 |
-
|
| 376 |
-
# Examine 'model' structure if it exists
|
| 377 |
-
if 'model' in checkpoint and isinstance(checkpoint['model'], dict):
|
| 378 |
-
model_dict = checkpoint['model']
|
| 379 |
-
# Get sample of keys to understand structure
|
| 380 |
-
first_keys = list(model_dict.keys())[:5]
|
| 381 |
-
print(f"'model' contains keys like: {first_keys}")
|
| 382 |
-
|
| 383 |
-
# Check for common prefixes in the model dict
|
| 384 |
-
prefixes = set()
|
| 385 |
-
for key in list(model_dict.keys())[:100]: # Check first 100 keys
|
| 386 |
-
parts = key.split('.')
|
| 387 |
-
if len(parts) > 1:
|
| 388 |
-
prefixes.add(parts[0])
|
| 389 |
-
if prefixes:
|
| 390 |
-
print(f"Common prefixes in model dict: {prefixes}")
|
| 391 |
-
else:
|
| 392 |
-
print(f"Checkpoint is not a dictionary, but a {type(checkpoint)}")
|
| 393 |
-
|
| 394 |
# Handle different checkpoint formats
|
| 395 |
if 'model' in checkpoint:
|
| 396 |
-
# Format from madrylab robust models
|
| 397 |
state_dict = checkpoint['model']
|
| 398 |
print("Using 'model' key from checkpoint")
|
| 399 |
elif 'state_dict' in checkpoint:
|
| 400 |
state_dict = checkpoint['state_dict']
|
| 401 |
print("Using 'state_dict' key from checkpoint")
|
| 402 |
else:
|
| 403 |
-
# Direct state dict
|
| 404 |
state_dict = checkpoint
|
| 405 |
print("Using checkpoint directly as state_dict")
|
| 406 |
|
| 407 |
-
#
|
| 408 |
resnet_state_dict = {}
|
| 409 |
-
prefixes_to_try = ['', 'module.', 'model.', 'attacker.model.']
|
| 410 |
resnet_keys = set(resnet.state_dict().keys())
|
| 411 |
|
| 412 |
-
#
|
| 413 |
-
|
| 414 |
-
|
| 415 |
-
|
| 416 |
-
|
| 417 |
-
|
| 418 |
-
print(f"Found 'module.model' structure with {len(module_model_keys)} parameters")
|
| 419 |
-
# Extract all parameters from module.model
|
| 420 |
-
for source_key, value in state_dict.items():
|
| 421 |
-
if source_key.startswith('module.model.'):
|
| 422 |
-
target_key = source_key[len('module.model.'):]
|
| 423 |
-
# Some ckpts have 'module.model.model.<...>'; remove the extra 'model.' too
|
| 424 |
-
if target_key.startswith('model.'):
|
| 425 |
-
target_key = target_key[len('model.'):]
|
| 426 |
-
resnet_state_dict[target_key] = value
|
| 427 |
-
|
| 428 |
-
print(f"Extracted {len(resnet_state_dict)} parameters from module.model")
|
| 429 |
-
|
| 430 |
-
# Check for 'attacker.model' structure
|
| 431 |
-
attacker_model_keys = [key for key in state_dict.keys() if key.startswith('attacker.model.')]
|
| 432 |
-
if attacker_model_keys:
|
| 433 |
-
print(f"Found 'attacker.model' structure with {len(attacker_model_keys)} parameters")
|
| 434 |
-
# Extract all parameters from attacker.model
|
| 435 |
-
for source_key, value in state_dict.items():
|
| 436 |
-
if source_key.startswith('attacker.model.'):
|
| 437 |
-
target_key = source_key[len('attacker.model.'):]
|
| 438 |
-
resnet_state_dict[target_key] = value
|
| 439 |
-
|
| 440 |
-
print(f"Extracted {len(resnet_state_dict)} parameters from attacker.model")
|
| 441 |
-
|
| 442 |
-
# Check if 'model' (not attacker.model) exists as a fallback
|
| 443 |
-
model_keys = [key for key in state_dict.keys() if key.startswith('model.') and not key.startswith('attacker.model.')]
|
| 444 |
-
if model_keys and len(resnet_state_dict) < len(resnet_keys):
|
| 445 |
-
print(f"Found additional 'model.' structure with {len(model_keys)} parameters")
|
| 446 |
-
# Try to complete missing parameters
|
| 447 |
for source_key, value in state_dict.items():
|
| 448 |
-
if source_key.startswith('model.'):
|
| 449 |
-
target_key = source_key[len('model.'):]
|
| 450 |
-
if target_key in resnet_keys
|
| 451 |
resnet_state_dict[target_key] = value
|
| 452 |
-
|
| 453 |
-
else:
|
| 454 |
-
# Check for other known structures
|
| 455 |
-
structure_found = False
|
| 456 |
-
|
| 457 |
-
# Check for 'model.' prefix
|
| 458 |
-
model_keys = [key for key in state_dict.keys() if key.startswith('model.')]
|
| 459 |
-
if model_keys:
|
| 460 |
-
print(f"Found 'model.' structure with {len(model_keys)} parameters")
|
| 461 |
-
for source_key, value in state_dict.items():
|
| 462 |
-
if source_key.startswith('model.'):
|
| 463 |
-
target_key = source_key[len('model.'):]
|
| 464 |
-
resnet_state_dict[target_key] = value
|
| 465 |
-
structure_found = True
|
| 466 |
|
| 467 |
-
#
|
| 468 |
-
|
| 469 |
-
|
| 470 |
-
if
|
| 471 |
-
|
| 472 |
-
|
| 473 |
-
|
| 474 |
-
|
| 475 |
-
|
| 476 |
-
|
| 477 |
-
|
| 478 |
-
|
| 479 |
-
|
| 480 |
-
|
| 481 |
-
if not structure_found:
|
| 482 |
-
print("No standard model structure found, trying prefix mappings...")
|
| 483 |
-
for target_key in resnet_keys:
|
| 484 |
-
for prefix in prefixes_to_try:
|
| 485 |
-
source_key = prefix + target_key
|
| 486 |
-
if source_key in state_dict:
|
| 487 |
-
resnet_state_dict[target_key] = state_dict[source_key]
|
| 488 |
-
break
|
| 489 |
|
| 490 |
-
#
|
| 491 |
-
if len(resnet_state_dict)
|
| 492 |
-
|
| 493 |
-
|
| 494 |
-
# Track matches found through prefix removal
|
| 495 |
-
prefix_matches = {prefix: 0 for prefix in ['module.', 'model.', 'attacker.model.', 'attacker.']}
|
| 496 |
-
layer_matches = {} # Track matches by layer type
|
| 497 |
-
|
| 498 |
-
# Count parameter keys by layer type for analysis
|
| 499 |
-
for key in resnet_keys:
|
| 500 |
-
layer_name = key.split('.')[0] if '.' in key else key
|
| 501 |
-
if layer_name not in layer_matches:
|
| 502 |
-
layer_matches[layer_name] = {'total': 0, 'matched': 0}
|
| 503 |
-
layer_matches[layer_name]['total'] += 1
|
| 504 |
|
| 505 |
-
# Try keys with common prefixes
|
| 506 |
for source_key, value in state_dict.items():
|
| 507 |
-
# Skip if already found
|
| 508 |
target_key = source_key
|
| 509 |
-
matched_prefix = None
|
| 510 |
|
| 511 |
# Try removing various prefixes
|
| 512 |
-
for prefix in
|
| 513 |
if source_key.startswith(prefix):
|
| 514 |
target_key = source_key[len(prefix):]
|
| 515 |
-
matched_prefix = prefix
|
| 516 |
break
|
| 517 |
|
| 518 |
-
#
|
| 519 |
-
if target_key
|
|
|
|
|
|
|
|
|
|
|
|
|
| 520 |
resnet_state_dict[target_key] = value
|
| 521 |
-
|
| 522 |
-
# Update match statistics
|
| 523 |
-
if matched_prefix:
|
| 524 |
-
prefix_matches[matched_prefix] += 1
|
| 525 |
-
|
| 526 |
-
# Update layer matches
|
| 527 |
-
layer_name = target_key.split('.')[0] if '.' in target_key else target_key
|
| 528 |
-
if layer_name in layer_matches:
|
| 529 |
-
layer_matches[layer_name]['matched'] += 1
|
| 530 |
-
|
| 531 |
-
# Print detailed prefix removal statistics
|
| 532 |
-
print("\n=== Prefix Removal Statistics ===")
|
| 533 |
-
total_matches = sum(prefix_matches.values())
|
| 534 |
-
print(f"Total parameters matched through prefix removal: {total_matches}/{len(resnet_keys)} ({(total_matches/len(resnet_keys))*100:.1f}%)")
|
| 535 |
-
|
| 536 |
-
# Show matches by prefix
|
| 537 |
-
print("\nMatches by prefix:")
|
| 538 |
-
for prefix, count in sorted(prefix_matches.items(), key=lambda x: x[1], reverse=True):
|
| 539 |
-
if count > 0:
|
| 540 |
-
print(f" {prefix}: {count} parameters")
|
| 541 |
-
|
| 542 |
-
# Show matches by layer type
|
| 543 |
-
print("\nMatches by layer type:")
|
| 544 |
-
for layer, stats in sorted(layer_matches.items(), key=lambda x: x[1]['total'], reverse=True):
|
| 545 |
-
match_percent = (stats['matched'] / stats['total']) * 100 if stats['total'] > 0 else 0
|
| 546 |
-
print(f" {layer}: {stats['matched']}/{stats['total']} ({match_percent:.1f}%)")
|
| 547 |
-
|
| 548 |
-
# Check for specific important layers (conv1, layer1, etc.)
|
| 549 |
-
critical_layers = ['conv1', 'bn1', 'layer1', 'layer2', 'layer3', 'layer4', 'fc']
|
| 550 |
-
print("\nStatus of critical layers:")
|
| 551 |
-
for layer in critical_layers:
|
| 552 |
-
if layer in layer_matches:
|
| 553 |
-
match_percent = (layer_matches[layer]['matched'] / layer_matches[layer]['total']) * 100
|
| 554 |
-
status = "✅ COMPLETE" if layer_matches[layer]['matched'] == layer_matches[layer]['total'] else "⚠️ INCOMPLETE"
|
| 555 |
-
print(f" {layer}: {layer_matches[layer]['matched']}/{layer_matches[layer]['total']} ({match_percent:.1f}%) - {status}")
|
| 556 |
-
else:
|
| 557 |
-
print(f" {layer}: Not found in model")
|
| 558 |
|
| 559 |
-
# Load the
|
| 560 |
if resnet_state_dict:
|
| 561 |
-
|
| 562 |
-
|
| 563 |
-
|
| 564 |
-
|
| 565 |
-
|
| 566 |
-
|
| 567 |
-
|
| 568 |
-
|
| 569 |
-
|
| 570 |
-
loading_report.append(f"Total parameters in model: {len(resnet.state_dict()):,}")
|
| 571 |
-
loading_report.append(f"Missing keys: {len(missing_keys):,} parameters")
|
| 572 |
-
loading_report.append(f"Unexpected keys: {len(unexpected_keys):,} parameters")
|
| 573 |
-
|
| 574 |
-
# Calculate percentage of parameters loaded
|
| 575 |
-
loaded_keys = set(resnet_state_dict.keys()) - set(unexpected_keys)
|
| 576 |
-
loaded_percent = (len(loaded_keys) / len(resnet.state_dict())) * 100
|
| 577 |
-
|
| 578 |
-
-                    # Determine loading success status
-                    if loaded_percent >= 99.5:
-                        status = "✅ COMPLETE - All important parameters loaded"
-                    elif loaded_percent >= 90:
-                        status = "🟡 PARTIAL - Most parameters loaded, should still function"
-                    elif loaded_percent >= 50:
-                        status = "⚠️ INCOMPLETE - Many parameters missing, may not function properly"
-                    else:
-                        status = "❌ FAILED - Critical parameters missing, will not function properly"
-
-                    loading_report.append(f"Successfully loaded: {len(loaded_keys):,} parameters ({loaded_percent:.1f}%)")
-                    loading_report.append(f"Loading status: {status}")
-
-                    # If loading is severely incomplete, fall back to PyTorch's pretrained model
-                    if loaded_percent < 50:
-                        loading_report.append("\n⚠️ WARNING: Loading from checkpoint is too incomplete.")
-                        loading_report.append("⚠️ Falling back to PyTorch's pretrained model to avoid broken inference.")
-
-                        # Create a new ResNet model with pretrained weights
                         resnet = models.resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
                         model = nn.Sequential(normalizer, resnet)
-                        loading_report.append("✅ Successfully loaded PyTorch's pretrained ResNet50 model")
-
-                    # Show missing keys by layer type
-                    if missing_keys:
-                        loading_report.append("\nMissing keys by layer type:")
-                        layer_types = {}
-                        for key in missing_keys:
-                            # Extract layer type (e.g., 'conv', 'bn', 'layer1', etc.)
-                            parts = key.split('.')
-                            if len(parts) > 0:
-                                layer_type = parts[0]
-                                if layer_type not in layer_types:
-                                    layer_types[layer_type] = 0
-                                layer_types[layer_type] += 1
-
-                        # Add counts by layer type
-                        for layer_type, count in sorted(layer_types.items(), key=lambda x: x[1], reverse=True):
-                            loading_report.append(f"  {layer_type}: {count:,} parameters")
-
-                        loading_report.append("\nFirst 10 missing keys:")
-                        for i, key in enumerate(sorted(missing_keys)[:10]):
-                            loading_report.append(f"  {i+1}. {key}")
-
-                    # Show unexpected keys if any
-                    if unexpected_keys:
-                        loading_report.append("\nFirst 10 unexpected keys:")
-                        for i, key in enumerate(sorted(unexpected_keys)[:10]):
-                            loading_report.append(f"  {i+1}. {key}")
-
-                    loading_report.append("========================================")
-
-                    os.makedirs("logs", exist_ok=True)
-                    with open(f"logs/model_loading_{model_type}.log", "w") as f:
-                        f.write(report_text)
-
-                # Look for normalizer parameters as well
-                if any(key.startswith('attacker.normalize.') for key in state_dict.keys()):
-                    norm_state_dict = {}
-                    for key, value in state_dict.items():
-                        if key.startswith('attacker.normalize.'):
-                            norm_key = key[len('attacker.normalize.'):]
-                            norm_state_dict[norm_key] = value
-
-                    if norm_state_dict:
-                        try:
-                            normalizer.load_state_dict(norm_state_dict, strict=False)
-                            print("Successfully loaded normalizer parameters")
-                        except Exception as e:
-                            print(f"Warning: Could not load normalizer parameters: {e}")
-            except Exception as e:
-                print(f"Warning: Error loading ResNet parameters: {e}")
-                # Fall back to loading without normalizer
-                model = resnet  # Use just the ResNet model without normalizer
         except Exception as e:
-            print(f"Error loading checkpoint: {e}")
-
             resnet = models.resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
             model = nn.Sequential(normalizer, resnet)
-        else:
-            print("No checkpoint available, using PyTorch's pretrained model")
-            resnet = models.resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
-            model = nn.Sequential(normalizer, resnet)

         model = model.to(device)
-        model.eval()

-        # Verify model
         self.verify_model_integrity(model, model_type)

-        #
         self.models[model_type] = model
         end_time = time.time()
-
-        print(f"Model {model_type} loaded in {load_time:.2f} seconds")
         return model
-
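Both the removed loader above and the rewritten one below hinge on the same pattern: filter the checkpoint down to keys the torchvision ResNet actually owns, load with `strict=False`, and judge health by coverage. A minimal, self-contained sketch of that pattern, assuming a local checkpoint (the path `ckpt_path.pt` and the 1000-class head are illustrative, not from the commit):

```python
# Sketch of partial checkpoint loading with a coverage check (illustrative).
import torch
import torchvision.models as models

resnet = models.resnet50(num_classes=1000)
state_dict = torch.load("ckpt_path.pt", map_location="cpu")
state_dict = state_dict.get("model", state_dict)  # unwrap a common wrapper key

wanted = set(resnet.state_dict().keys())
filtered = {k.removeprefix("module."): v for k, v in state_dict.items()}
filtered = {k: v for k, v in filtered.items() if k in wanted}

# load_state_dict(strict=False) returns (missing_keys, unexpected_keys)
missing, unexpected = resnet.load_state_dict(filtered, strict=False)
coverage = 100.0 * len(filtered) / len(wanted)
print(f"loaded {coverage:.1f}% of parameters; {len(missing)} missing")
```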
     def inference(self, image, model_type, config):
-        """Run generative inference"""
-        # Time the entire inference process
         inference_start = time.time()

-        # Load model
         model = self.load_model(model_type)

-        #
         if isinstance(image, str):
             if os.path.exists(image):
                 image = Image.open(image).convert('RGB')
             else:
                 raise ValueError(f"Image path does not exist: {image}")
-        elif isinstance(image, np.ndarray):
-            ...

-        # Pick the right preproc for this model
-        pre = self.model_preproc.get(model_type, {"size": 224, "mean": IMAGENET_MEAN, "std": IMAGENET_STD})

-        #
-        ...
-            norm_std=pre["std"]
-        )
-
-        print(f"[PREPROC] {model_type}: size={pre['size']} mean={pre['mean']} std={pre['std']} (transform normalize=False; model has internal normalizer)")

-        #
-        if config['inference_normalization']:
-            ...
         else:
-            ...

-        #
-        ...

         # Get original predictions
         with torch.no_grad():
-            if isinstance(model, nn.Sequential):
-                print("Model is sequential with normalization")
-                # Get the core model part (typically at index 1 in Sequential)
-                core_model = model[1]
-                if config['inference_normalization']:
-                    output_original = model(image_tensor)  # Model includes normalization
-                else:
-                    output_original = core_model(image_tensor)  # Model includes normalization
-
             else:
-                ...
-                # Use manual normalization for non-sequential models
-                if config['inference_normalization']:
-                    normalized_tensor = normalize_transform(image_tensor)
-                    output_original = model(normalized_tensor)
-                else:
-                    output_original = model(image_tensor)
-                core_model = model

         probs_orig = F.softmax(output_original, dim=1)
         conf_orig, classes_orig = torch.max(probs_orig, 1)

-        # Get least confident classes
-        ...

         # Initialize inference step
         infer_step = InferStep(image_tensor, config['eps'], config['step_size'])

         # Storage for inference steps
-        # Create a new tensor that requires gradients
         x = image_tensor.clone().detach().requires_grad_(True)
         all_steps = [image_tensor[0].detach().cpu()]

-        ...
-        if config['loss_infer'] == 'Prior-Guided Drift Diffusion':
-            print(f"Setting up Prior-Guided Drift Diffusion with layer {config['top_layer']} and noise {config['initial_inference_noise_ratio']}...")
-
-            # Extract model up to the specified layer
-            try:
-                # Start by finding the actual model to use
-                base_model = model
-
-                # Handle DataParallel wrapper if present
-                if hasattr(base_model, 'module'):
-                    base_model = base_model.module
-
-                # Log the initial model structure
-                print(f"DEBUG - Initial model structure: {type(base_model)}")
-
-                # If we have a Sequential model (which is likely our normalizer + model structure)
-                if isinstance(base_model, nn.Sequential):
-                    print(f"DEBUG - Sequential model with {len(list(base_model.children()))} children")
-
-                    # If this is our NormalizeByChannelMeanStd + ResNet pattern
-                    if len(list(base_model.children())) >= 2:
-                        # The actual ResNet model is the second component (index 1)
-                        actual_model = list(base_model.children())[1]
-                        print(f"DEBUG - Using ResNet component: {type(actual_model)}")
-                        print(f"DEBUG - Available layers: {[name for name, _ in actual_model.named_children()]}")
-
-                        # Extract from the actual ResNet
-                        layer_model = extract_middle_layers(actual_model, config['top_layer'])
-                    else:
-                        # Just a single component Sequential
-                        layer_model = extract_middle_layers(base_model, config['top_layer'])
-                else:
-                    # Not Sequential, might be direct model
-                    print(f"DEBUG - Available layers: {[name for name, _ in base_model.named_children()]}")
-                    layer_model = extract_middle_layers(base_model, config['top_layer'])
-
-                print(f"Successfully extracted model up to layer: {config['top_layer']}")
-            except ValueError as e:
-                print(f"Layer extraction failed: {e}. Using full model.")
-                layer_model = model
-
-            # Add noise to the image - exactly match original code
-            added_noise = config['initial_inference_noise_ratio'] * torch.randn_like(image_tensor).to(device)
-            noisy_image_tensor = image_tensor + added_noise
-
-            # Compute noisy features - simplified to match original code
-            noisy_features = layer_model(noisy_image_tensor)
-
-            print(f"Noisy features computed for Prior-Guided Drift Diffusion target with shape: {noisy_features.shape if hasattr(noisy_features, 'shape') else 'unknown'}")

         # Main inference loop
         print(f"Starting inference loop with {config['n_itr']} iterations for {config['loss_infer']}...")
-
         for i in range(config['n_itr']):
             # Reset gradients
             x.grad = None

-            ...
-            # Standard forward pass with full model
-            # Simplified to match original code's approach
-            output = model(x)
-
-            # Calculate loss and gradients based on inference type
-            try:
-                if config['loss_infer'] == 'Prior-Guided Drift Diffusion':
-                    # Use MSE loss to match the noisy features
-                    assert config['loss_function'] == 'MSE', "Reverse Diffusion loss function must be MSE"
-                    if noisy_features is not None:
-                        loss = F.mse_loss(output, noisy_features)
-                        grad = torch.autograd.grad(loss, x)[0]  # Removed retain_graph=True to match original
-                    else:
-                        raise ValueError("Noisy features not computed for Prior-Guided Drift Diffusion")
-
-                else:  # Default 'IncreaseConfidence' approach
-                    # Get the least confident classes
-                    num_classes = min(10, least_confident_classes.size(1))
-                    target_classes = least_confident_classes[0, :num_classes]
-
-                    # Create targets for least confident classes
-                    targets = torch.tensor([idx.item() for idx in target_classes], device=device)
-
-                    # Use a combined loss to increase confidence
-                    loss = 0
-                    for target in targets:
-                        # Create one-hot target
-                        one_hot = torch.zeros_like(output)
-                        one_hot[0, target] = 1
-                        # Use loss to maximize confidence
-                        loss = loss + F.mse_loss(F.softmax(output, dim=1), one_hot)
-
-                    grad = torch.autograd.grad(loss, x, retain_graph=True)[0]

-                if grad is None:
-                    ...
-                    random_noise = (torch.rand_like(x) - 0.5) * 2 * config['step_size']
-                    x = infer_step.project(x + random_noise)
                 else:
-                    ...
-            ...

             # Store step if in iterations_to_show
-            if i+1 in config['iterations_to_show']:
                 all_steps.append(x[0].detach().cpu())

-        #
         with torch.no_grad():
-            if isinstance(model, nn.Sequential):
-                if config['inference_normalization']:
-                    final_output = model(x)
-                else:
-                    final_output = core_model(x)
             else:
-                if config['inference_normalization']:
-                    normalized_x = normalize_transform(x)
-                    final_output = model(normalized_x)
-                else:
-                    final_output = model(x)

         final_probs = F.softmax(final_output, dim=1)
         final_conf, final_classes = torch.max(final_probs, 1)

-        # Calculate timing information
-        loop_time = time.time() - loop_start
         total_time = time.time() - inference_start
-        avg_iter_time = loop_time / config['n_itr'] if config['n_itr'] > 0 else 0

         print(f"Original top class: {classes_orig.item()} ({conf_orig.item():.4f})")
         print(f"Final top class: {final_classes.item()} ({final_conf.item():.4f})")
-        print(f"Inference loop completed in {loop_time:.2f} seconds ({avg_iter_time:.4f} sec/iteration)")
         print(f"Total inference time: {total_time:.2f} seconds")

-        # Return results in
         return {
             'final_image': x[0].detach().cpu(),
             'steps': all_steps,
             'original_class': classes_orig.item(),
             'original_confidence': conf_orig.item(),
             'final_class': final_classes.item(),
-            'final_confidence': final_conf.item()
         }

-# Utility function to show inference steps
 def show_inference_steps(steps, figsize=(15, 10)):
-    ...
+"""Complete generative inference module with model loading and inference capabilities."""
+
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
 import torchvision.transforms as transforms
+import torchvision.models as models
 from torchvision.models.resnet import ResNet50_Weights
 from PIL import Image
 import numpy as np
 import os
 import requests
 import time
 import copy
 from collections import OrderedDict
 from pathlib import Path
+from typing import Dict, List, Optional, Tuple, Union

 # Check for available hardware acceleration
 if torch.cuda.is_available():
     device = torch.device("cuda")
 elif torch.backends.mps.is_available():
     device = torch.device("mps")
 else:
     device = torch.device("cpu")
 print(f"Using device: {device}")

+# Constants for model URLs
 MODEL_URLS = {
     'resnet50_robust': 'https://huggingface.co/madrylab/robust-imagenet-models/resolve/main/resnet50_l2_eps3.ckpt',
     'resnet50_standard': 'https://huggingface.co/madrylab/robust-imagenet-models/resolve/main/resnet50_l2_eps0.ckpt',
     'resnet50_robust_face': 'https://huggingface.co/ttoosi/resnet50_robust_face/resolve/main/resnet50_imagenet_L2_eps_0.50_checkpoint150.pt'
 }

+# Model-specific preprocessing configurations
+MODEL_CONFIGS = {
+    'resnet50_robust_face': {
+        'input_size': 112,
+        'norm_mean': [0.5, 0.5, 0.5],
+        'norm_std': [0.5, 0.5, 0.5],
+        'n_classes': 500,
+        'dataset': 'VGGFace2'
+    },
+    'resnet50_standard': {
+        'input_size': 224,
+        'norm_mean': [0.485, 0.456, 0.406],
+        'norm_std': [0.229, 0.224, 0.225],
+        'n_classes': 1000,
+        'dataset': 'ImageNet'
+    },
+    'resnet50_robust': {
+        'input_size': 224,
+        'norm_mean': [0.485, 0.456, 0.406],
+        'norm_std': [0.229, 0.224, 0.225],
+        'n_classes': 1000,
+        'dataset': 'ImageNet'
+    }
 }

 IMAGENET_MEAN = [0.485, 0.456, 0.406]
 IMAGENET_STD = [0.229, 0.224, 0.225]

+def get_iterations_to_show(n_itr):
+    """Generate a dynamic list of iterations to show based on total iterations."""
+    if n_itr <= 50:
+        return [1, 5, 10, 20, 30, 40, 50, n_itr]
+    elif n_itr <= 100:
+        return [1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, n_itr]
+    elif n_itr <= 200:
+        return [1, 5, 10, 20, 30, 40, 50, 75, 100, 125, 150, 175, 200, n_itr]
+    elif n_itr <= 500:
+        return [1, 5, 10, 20, 30, 40, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, n_itr]
     else:
+        return [1, 5, 10, 20, 30, 40, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500,
+                int(n_itr*0.6), int(n_itr*0.7), int(n_itr*0.8), int(n_itr*0.9), n_itr]

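Note the schedule can contain values above `n_itr` and a duplicated endpoint; this is harmless downstream because the loop only tests membership with `i+1 in iterations_to_show`. A quick check, runnable as-is once the function above is defined:

```python
# Values beyond n_itr never fire (i+1 <= n_itr), duplicates are benign.
print(get_iterations_to_show(50))
# [1, 5, 10, 20, 30, 40, 50, 50]
print(get_iterations_to_show(30))
# [1, 5, 10, 20, 30, 40, 50, 30]  (40 and 50 never trigger when n_itr=30)
```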
+def get_inference_configs(inference_type='IncreaseConfidence', eps=0.5, n_itr=50, step_size=1.0):
+    """Generate inference configuration with customizable parameters."""

+    config = {
+        'loss_infer': inference_type,
+        'n_itr': n_itr,
+        'eps': eps,
+        'step_size': step_size,
+        'diffusion_noise_ratio': 0.0,
+        'initial_inference_noise_ratio': 0.0,
+        'top_layer': 'all',
+        'inference_normalization': False,
+        'recognition_normalization': False,
+        'iterations_to_show': get_iterations_to_show(n_itr),
+        'misc_info': {'keep_grads': False}
+    }

+    if inference_type == 'IncreaseConfidence':
+        config['loss_function'] = 'CE'
+    elif inference_type == 'Prior-Guided Drift Diffusion':
+        config['loss_function'] = 'MSE'
+        config['initial_inference_noise_ratio'] = 0.05
+        config['diffusion_noise_ratio'] = 0.01
+        config['top_layer'] = 'layer4'
+    elif inference_type == 'GradModulation':
+        config['loss_function'] = 'CE'
+        config['misc_info']['grad_modulation'] = 0.5
+    elif inference_type == 'CompositionalFusion':
+        config['loss_function'] = 'CE'
+        config['misc_info']['positive_classes'] = []
+        config['misc_info']['negative_classes'] = []

+    return config

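For reference, a small usage sketch; every value asserted below follows directly from the defaults in the function above:

```python
cfg = get_inference_configs('Prior-Guided Drift Diffusion', eps=0.5, n_itr=100)
assert cfg['loss_function'] == 'MSE'
assert cfg['top_layer'] == 'layer4'
assert cfg['initial_inference_noise_ratio'] == 0.05
assert cfg['diffusion_noise_ratio'] == 0.01

# IncreaseConfidence keeps the generic defaults and a CE loss:
cfg_ce = get_inference_configs()  # inference_type='IncreaseConfidence'
assert cfg_ce['loss_function'] == 'CE' and cfg_ce['top_layer'] == 'all'
```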
+def get_model_preprocessing(model_type: str) -> Dict:
+    """Get preprocessing configuration for specific model type."""
+    if model_type not in MODEL_CONFIGS:
+        print(f"Fall-back: Unknown model type {model_type}, using ImageNet defaults")
+        return MODEL_CONFIGS['resnet50_standard']
+    return MODEL_CONFIGS[model_type]

 class NormalizeByChannelMeanStd(nn.Module):
+    """Normalization layer for models."""
     def __init__(self, mean, std):
         super(NormalizeByChannelMeanStd, self).__init__()
         if not isinstance(mean, torch.Tensor):
             mean = torch.tensor(mean)
         if not isinstance(std, torch.Tensor):
             std = torch.tensor(std)
         self.register_buffer("mean", mean)
         self.register_buffer("std", std)

     def forward(self, tensor):
         return self.normalize_fn(tensor, self.mean, self.std)

     def normalize_fn(self, tensor, mean, std):
         """Differentiable version of torchvision.functional.normalize"""
         mean = mean[None, :, None, None]
         std = std[None, :, None, None]
         return tensor.sub(mean).div(std)

 class InferStep:
+    """Inference step class for gradient-based optimization."""
+
+    def __init__(self, orig_image: torch.Tensor, eps: float, step_size: float):
         self.orig_image = orig_image
         self.eps = eps
         self.step_size = step_size

+    def project(self, x: torch.Tensor) -> torch.Tensor:
+        """Project x onto epsilon-ball around original image."""
         diff = x - self.orig_image
         diff = torch.clamp(diff, -self.eps, self.eps)
         return torch.clamp(self.orig_image + diff, 0, 1)

+    def step(self, x: torch.Tensor, grad: torch.Tensor) -> torch.Tensor:
+        """Take a normalized gradient step."""
+        dim = len(x.shape) - 1
+        grad_norm = torch.norm(grad.reshape(grad.shape[0], -1), dim=1).reshape(-1, *([1] * dim))
         scaled_grad = grad / (grad_norm + 1e-10)
         return scaled_grad * self.step_size

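Geometrically, `step` returns a drift of fixed L2 length `step_size` (the gradient is normalized), while `project` clamps each pixel's deviation from the original image to ±`eps` (an L-infinity box) and re-clamps to [0, 1]. A runnable sketch of one projected update, assuming `InferStep` as defined above (all values illustrative):

```python
import torch

x0 = torch.rand(1, 3, 8, 8)                  # stands in for the input image
step = InferStep(x0, eps=0.5, step_size=1.0)

g = torch.randn_like(x0)                     # stands in for a loss gradient
delta = step.step(x0, g)                     # L2-normalized, length == step_size
print(delta.flatten().norm())                # ~1.0

x1 = step.project(x0 + delta)                # per-pixel |x1 - x0| <= eps, x1 in [0, 1]
print((x1 - x0).abs().max() <= 0.5)          # tensor(True)
```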
+
def extract_middle_layers(model: nn.Module, layer_index: Union[str, int]) -> nn.Module:
|
| 160 |
+
"""Extract middle layers from a model up to a specified layer index."""
|
| 161 |
+
if isinstance(layer_index, str) and layer_index == 'all':
|
| 162 |
+
return model
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 163 |
|
| 164 |
+
# Handle ResNet layer extraction
|
| 165 |
+
modules = list(model.named_children())
|
| 166 |
+
cutoff_idx = next(
|
| 167 |
+
(i for i, (name, _) in enumerate(modules) if name == str(layer_index)),
|
| 168 |
+
None
|
| 169 |
+
)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 170 |
|
| 171 |
+
if cutoff_idx is not None:
|
| 172 |
+
new_model = nn.Sequential(OrderedDict(modules[:cutoff_idx + 1]))
|
| 173 |
+
return new_model
|
| 174 |
+
else:
|
| 175 |
+
print(f"Fall-back: Module {layer_index} not found, using full model")
|
| 176 |
+
return model
|
| 177 |
+
|
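Truncating after `layer4` drops `avgpool` and `fc`, which is why the Prior-Guided Drift Diffusion mode matches spatial feature maps rather than logits. A quick shape check against a plain torchvision ResNet-50:

```python
import torch
import torchvision.models as models

trunk = extract_middle_layers(models.resnet50(), 'layer4').eval()
with torch.no_grad():
    feats = trunk(torch.randn(1, 3, 224, 224))
print(feats.shape)  # torch.Size([1, 2048, 7, 7]) -- spatial features, not logits
```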
+def calculate_loss(output_model: torch.Tensor, class_indices: List[int], loss_inference: str) -> torch.Tensor:
+    """Calculate loss for specified class indices."""
+    losses = []
+    for idx in class_indices:
+        target = torch.full((1,), idx, dtype=torch.long, device=output_model.device)
+        if loss_inference == 'CE':
+            loss = nn.CrossEntropyLoss()(output_model, target)
+        elif loss_inference == 'MSE':
+            one_hot_target = torch.zeros_like(output_model)
+            one_hot_target[0, target] = 1
+            loss = nn.MSELoss()(output_model, one_hot_target)
+        else:
+            raise ValueError(f"Unsupported loss_inference: {loss_inference}")
+        losses.append(loss)

+    return torch.stack(losses).mean()
+
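Note that the MSE branch compares raw model outputs to a one-hot vector, whereas the removed in-loop version compared `F.softmax(output)` to the one-hot target. A tiny sanity check with uniform logits (values follow from the definitions above):

```python
import torch

logits = torch.zeros(1, 10)
ce = calculate_loss(logits, [3, 7], 'CE')   # mean CE over two targets
mse = calculate_loss(logits, [3], 'MSE')    # MSE of zeros vs one-hot for class 3
print(ce.item(), mse.item())                 # ~2.3026 (= log 10), 0.1
```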
+def download_model(model_type):
+    """Download model if needed."""
+    if model_type not in MODEL_URLS or MODEL_URLS[model_type] is None:
+        return None

+    os.makedirs("models", exist_ok=True)

+    if model_type == 'resnet50_robust_face':
+        model_path = Path("models/resnet50_vggface2_L2_eps_0.50_checkpoint150.pt")
+    else:
+        model_path = Path(f"models/{model_type}.pt")

+    if not model_path.exists():
+        print(f"Downloading {model_type} model...")
+        url = MODEL_URLS[model_type]
+        response = requests.get(url, stream=True)
+        if response.status_code == 200:
+            with open(model_path, 'wb') as f:
+                for chunk in response.iter_content(chunk_size=8192):
+                    f.write(chunk)
+            print(f"Model downloaded and saved to {model_path}")
+        else:
+            raise RuntimeError(f"Failed to download model: {response.status_code}")
+    return model_path

 class GenerativeInferenceModel:
+    """Complete generative inference model with model loading and inference."""
+
     def __init__(self):
         self.models = {}
         self.model_preproc = {}
+        self.labels = self.get_imagenet_labels()

+    def get_imagenet_labels(self):
+        """Get ImageNet labels."""
+        url = "https://raw.githubusercontent.com/anishathalye/imagenet-simple-labels/master/imagenet-simple-labels.json"
         try:
+            response = requests.get(url)
+            if response.status_code == 200:
+                return response.json()
+            else:
+                print("Fall-back: Failed to fetch ImageNet labels, using placeholder")
+                return [f"class_{i}" for i in range(1000)]
         except Exception as e:
+            print(f"Fall-back: Error fetching labels: {e}")
+            return [f"class_{i}" for i in range(1000)]

     def load_model(self, model_type):
+        """Load and cache models for different model types."""
         if model_type in self.models:
             print(f"Using cached {model_type} model")
             return self.models[model_type]

         start_time = time.time()
+
+        # Get model-specific preprocessing config
+        preproc_config = get_model_preprocessing(model_type)
+        self.model_preproc[model_type] = preproc_config
+
+        # Create normalizer
+        normalizer = NormalizeByChannelMeanStd(
+            preproc_config['norm_mean'],
+            preproc_config['norm_std']
+        ).to(device)
+
+        # Create base model architecture
+        num_classes = preproc_config['n_classes']
+        resnet = models.resnet50(num_classes=num_classes)
         model = nn.Sequential(normalizer, resnet)

+        # Download and load checkpoint
+        model_path = download_model(model_type)
+
         if model_path:
             print(f"Loading {model_type} model from {model_path}...")
             try:
                 checkpoint = torch.load(model_path, map_location=device)

                 # Handle different checkpoint formats
                 if 'model' in checkpoint:
                     state_dict = checkpoint['model']
                     print("Using 'model' key from checkpoint")
                 elif 'state_dict' in checkpoint:
                     state_dict = checkpoint['state_dict']
                     print("Using 'state_dict' key from checkpoint")
                 else:
                     state_dict = checkpoint
                     print("Using checkpoint directly as state_dict")

+                # Extract ResNet state dict
                 resnet_state_dict = {}
                 resnet_keys = set(resnet.state_dict().keys())

+                # For face model, prioritize 'module.model.model.' structure (seen in actual checkpoint)
+                if model_type == 'resnet50_robust_face':
+                    # Check for 'module.model.model.' structure first (face checkpoints use this)
+                    module_model_model_keys = [key for key in state_dict.keys() if key.startswith('module.model.model.')]
+                    if module_model_model_keys:
+                        print(f"Found 'module.model.model.' structure with {len(module_model_model_keys)} parameters (face model)")
                         for source_key, value in state_dict.items():
+                            if source_key.startswith('module.model.model.'):
+                                target_key = source_key[len('module.model.model.'):]
+                                if target_key in resnet_keys:
                                     resnet_state_dict[target_key] = value
+                        print(f"Extracted {len(resnet_state_dict)} parameters from module.model.model.")

+                    # Also check for 'module.model.' structure as fallback
+                    if len(resnet_state_dict) < len(resnet_keys):
+                        module_model_keys = [key for key in state_dict.keys() if key.startswith('module.model.') and not key.startswith('module.model.model.')]
+                        if module_model_keys:
+                            print(f"Found additional 'module.model.' structure with {len(module_model_keys)} parameters")
+                            for source_key, value in state_dict.items():
+                                if source_key.startswith('module.model.') and not source_key.startswith('module.model.model.'):
+                                    target_key = source_key[len('module.model.'):]
+                                    # Remove extra 'model.' if present
+                                    if target_key.startswith('model.'):
+                                        target_key = target_key[len('model.'):]
+                                    if target_key in resnet_keys and target_key not in resnet_state_dict:
+                                        resnet_state_dict[target_key] = value
+                            print(f"Now have {len(resnet_state_dict)} parameters after adding module.model. keys")

+                # Handle different key prefixes in checkpoints (for other models)
+                if len(resnet_state_dict) == 0:
+                    prefixes_to_try = ['', 'module.', 'model.', 'attacker.model.', 'attacker.']

                     for source_key, value in state_dict.items():
                         target_key = source_key

                         # Try removing various prefixes
+                        for prefix in prefixes_to_try:
                             if source_key.startswith(prefix):
                                 target_key = source_key[len(prefix):]
                                 break

+                        # Handle nested model keys
+                        if target_key.startswith('model.'):
+                            target_key = target_key[len('model.'):]
+
+                        # If the target key is in ResNet keys, add it
+                        if target_key in resnet_keys:
                             resnet_state_dict[target_key] = value

+                # Load the state dict
                 if resnet_state_dict:
+                    result = resnet.load_state_dict(resnet_state_dict, strict=False)
+                    missing_keys, unexpected_keys = result
+
+                    loaded_percent = (len(resnet_state_dict) / len(resnet_keys)) * 100
+                    print(f"Model loading: {len(resnet_state_dict)}/{len(resnet_keys)} parameters ({loaded_percent:.1f}%)")
+
+                    if loaded_percent < 50:
+                        print(f"Fall-back: Loading too incomplete ({loaded_percent:.1f}%), using PyTorch pretrained")
+                        if model_type != 'resnet50_robust_face':
                             resnet = models.resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
                             model = nn.Sequential(normalizer, resnet)

+                else:
+                    print("Fall-back: No matching keys found in checkpoint, using PyTorch pretrained")
+                    if model_type != 'resnet50_robust_face':
+                        resnet = models.resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
+                        model = nn.Sequential(normalizer, resnet)

             except Exception as e:
+                print(f"Fall-back: Error loading checkpoint: {e}")
+                if model_type != 'resnet50_robust_face':
+                    print("Fall-back: Using PyTorch pretrained model")
+                    resnet = models.resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
+                    model = nn.Sequential(normalizer, resnet)
+                else:
+                    print("Fall-back: Face model checkpoint failed, model may not work properly")
+
+        else:
+            # Use PyTorch's pretrained model for ImageNet models
+            if model_type != 'resnet50_robust_face':
+                print(f"No checkpoint for {model_type}, using PyTorch pretrained")
                 resnet = models.resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
                 model = nn.Sequential(normalizer, resnet)
+            else:
+                print("Fall-back: Face model requires checkpoint, model may not work properly")

         model = model.to(device)
+        model.eval()

+        # Verify model
         self.verify_model_integrity(model, model_type)

+        # Cache the model
         self.models[model_type] = model
+
         end_time = time.time()
+        print(f"Model {model_type} loaded in {end_time - start_time:.2f} seconds")
         return model
+
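The prefix gymnastics exist because robustness-style checkpoints wrap the backbone several layers deep. One caveat worth flagging: as written, `prefixes_to_try` begins with the empty string, which matches every key immediately and breaks out of the loop, so in practice only the subsequent `model.` unwrap strips anything for non-face checkpoints. The sketch below shows the *intended* mapping with the empty prefix removed (key names follow the code above and the loading log later in this commit):

```python
# Intended key mapping; prefix order rearranged so stripping actually fires.
examples = [
    'module.model.model.conv1.weight',   # face checkpoint nesting
    'module.model.conv1.weight',         # fallback nesting
    'attacker.model.conv1.weight',       # robustness-library attacker copy
    'conv1.weight',                      # already bare
]
for key in examples:
    bare = key
    for prefix in ('module.model.model.', 'module.model.', 'attacker.model.',
                   'module.', 'model.', 'attacker.'):
        if bare.startswith(prefix):
            bare = bare[len(prefix):]
            break
    if bare.startswith('model.'):        # second-level unwrap, as in the loader
        bare = bare[len('model.'):]
    print(f'{key!r:40} -> {bare!r}')     # every example resolves to 'conv1.weight'
```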
+    def verify_model_integrity(self, model, model_type):
+        """Verify model integrity."""
+        try:
+            print(f"Fall-back: Running model integrity check for {model_type}")
+            config = get_model_preprocessing(model_type)
+            H = W = config['input_size']
+
+            test_input = torch.zeros(1, 3, H, W, device=device)
+            test_input[0, 0, H//4:3*H//4, W//4:3*W//4] = 0.5
+
+            with torch.no_grad():
+                output = model(test_input)
+
+            expected_classes = config['n_classes']
+            if output.shape != (1, expected_classes):
+                print(f"Fall-back: Unexpected output shape: {output.shape}, expected (1, {expected_classes})")
+                return False
+
+            probs = torch.nn.functional.softmax(output, dim=1)
+            confidence, prediction = torch.max(probs, 1)
+
+            print(f"Model integrity check passed:")
+            print(f"- Output shape: {output.shape}")
+            print(f"- Top prediction: Class {prediction.item()} with {confidence.item()*100:.2f}% confidence")
+
+            return True
+
+        except Exception as e:
+            print(f"Fall-back: Model integrity check failed with error: {e}")
+            return False
+
     def inference(self, image, model_type, config):
+        """Run generative inference."""
         inference_start = time.time()

+        # Load the model
         model = self.load_model(model_type)

+        # Handle image input
         if isinstance(image, str):
             if os.path.exists(image):
                 image = Image.open(image).convert('RGB')
             else:
                 raise ValueError(f"Image path does not exist: {image}")
+        elif isinstance(image, np.ndarray):
+            if image.dtype != np.uint8:
+                if image.max() <= 1.0:
+                    image = (image * 255).astype(np.uint8)
+                else:
+                    image = image.astype(np.uint8)
+            if len(image.shape) == 3:
+                if image.shape[0] == 3 or image.shape[0] == 1:
+                    image = np.transpose(image, (1, 2, 0))
+                if image.shape[2] == 4:
+                    image = image[:, :, :3]
+                elif image.shape[2] == 1:
+                    image = np.repeat(image, 3, axis=2)
+            image = Image.fromarray(image)
+        elif not isinstance(image, Image.Image):
+            try:
+                image = Image.fromarray(np.array(image)).convert('RGB')
+            except Exception as e:
+                raise ValueError(f"Cannot convert image type {type(image)} to PIL Image: {e}")

+        if isinstance(image, Image.Image) and image.mode != 'RGB':
+            image = image.convert('RGB')

+        # Get preprocessing config
+        preproc_config = get_model_preprocessing(model_type)
+        input_size = preproc_config['input_size']
+        norm_mean = torch.tensor(preproc_config['norm_mean'])
+        norm_std = torch.tensor(preproc_config['norm_std'])
+        n_classes = preproc_config['n_classes']

+        # Create transform
+        if config.get('inference_normalization', False):
+            transform = transforms.Compose([
+                transforms.Resize(input_size),
+                transforms.CenterCrop(input_size),
+                transforms.ToTensor(),
+                transforms.Normalize(norm_mean.tolist(), norm_std.tolist()),
+            ])
+            print(f"Fall-back: Using normalization with mean={norm_mean.tolist()}, std={norm_std.tolist()}")
         else:
+            transform = transforms.Compose([
+                transforms.Resize(input_size),
+                transforms.CenterCrop(input_size),
+                transforms.ToTensor(),
+            ])
+            print(f"Normalization OFF - feeding raw [0,1] tensors to model (normalization applied in the model)")

+        # Helper function to safely apply transform with fallback for numpy compatibility
+        def safe_transform(img):
+            try:
+                return transform(img)
+            except TypeError as e:
+                if "expected np.ndarray" in str(e) or "got numpy.ndarray" in str(e):
+                    # Fallback: manually convert PIL to tensor
+                    print(f"[WARNING] Transform failed with numpy compatibility issue, using manual conversion")
+                    # Apply resize and center crop manually
+                    resize_transform = transforms.Resize(input_size)
+                    crop_transform = transforms.CenterCrop(input_size)
+                    img = crop_transform(resize_transform(img))
+                    # Convert to numpy array and then to tensor using torch.tensor() to avoid numpy compatibility issues
+                    img_array = np.array(img, dtype=np.uint8)
+                    # Use torch.tensor() instead of torch.from_numpy() to avoid compatibility issues
+                    # Convert to float and normalize to [0, 1], then convert from HWC to CHW format
+                    img_tensor = torch.tensor(img_array, dtype=torch.float32).div(255.0).permute(2, 0, 1)
+                    # Apply normalization if needed
+                    if config.get('inference_normalization', False):
+                        img_tensor = transforms.Normalize(norm_mean.tolist(), norm_std.tolist())(img_tensor)
+                    return img_tensor
+                else:
+                    raise

+        # Prepare image tensor with safe transform
+        image_tensor = safe_transform(image).unsqueeze(0).to(device)
+        image_tensor.requires_grad = True
+
+        # Get model components
+        is_sequential = isinstance(model, nn.Sequential)
+        if is_sequential and isinstance(model[0], NormalizeByChannelMeanStd):
+            core_model = model[1]
+        else:
+            core_model = model
+
+        # Prepare model for layer extraction
+        if config.get('top_layer', 'all') != 'all':
+            new_model = extract_middle_layers(core_model, config['top_layer'])
+        else:
+            new_model = model
+
         # Get original predictions
         with torch.no_grad():
+            if config.get('inference_normalization', False):
+                output_original = model(image_tensor)
             else:
+                output_original = core_model(image_tensor)

         probs_orig = F.softmax(output_original, dim=1)
         conf_orig, classes_orig = torch.max(probs_orig, 1)

+        # Get least confident classes for IncreaseConfidence
+        if config['loss_infer'] == 'IncreaseConfidence':
+            _, least_confident_classes = torch.topk(probs_orig, k=int(n_classes / 10), largest=False)
+
+        # Setup for Prior-Guided Drift Diffusion
+        noisy_features = None
+        if config['loss_infer'] == 'Prior-Guided Drift Diffusion':
+            print(f"Setting up Prior-Guided Drift Diffusion...")
+            added_noise = config.get('initial_inference_noise_ratio', 0.05) * torch.randn_like(image_tensor).to(device)
+            noisy_image_tensor = image_tensor + added_noise
+            noisy_features = new_model(noisy_image_tensor)
+
         # Initialize inference step
         infer_step = InferStep(image_tensor, config['eps'], config['step_size'])

         # Storage for inference steps
         x = image_tensor.clone().detach().requires_grad_(True)
         all_steps = [image_tensor[0].detach().cpu()]

+        selected_inferred_patterns = []
+        perceived_categories = []
+        confidence_list = []

         # Main inference loop
         print(f"Starting inference loop with {config['n_itr']} iterations for {config['loss_infer']}...")
+
         for i in range(config['n_itr']):
             # Reset gradients
             x.grad = None

+            if i == 0:
+                # Get predictions for first iteration
+                if config.get('inference_normalization', False):
+                    output = model(x)
+                else:
+                    output = core_model(x)

+                if isinstance(output, torch.Tensor) and output.size(-1) == n_classes:
+                    probs = F.softmax(output, dim=1)
+                    conf, classes = torch.max(probs, 1)
                 else:
+                    probs = 0
+                    conf = 0
+                    classes = 'N/A'
+            else:
+                # Calculate loss and gradients
+                try:
+                    # Forward pass through new_model for feature extraction
+                    features = new_model(x)

+                    if config['loss_infer'] == 'Prior-Guided Drift Diffusion':
+                        assert config.get('loss_function', 'MSE') == 'MSE', "Prior-Guided Drift Diffusion requires MSE loss"
+                        if noisy_features is not None:
+                            loss = F.mse_loss(features, noisy_features)
+                            grad = torch.autograd.grad(loss, x)[0]
+                            adjusted_grad = infer_step.step(x, grad)
+                        else:
+                            raise ValueError("Noisy features not computed for Prior-Guided Drift Diffusion")

+                    elif config['loss_infer'] == 'IncreaseConfidence':
+                        # Calculate loss using least confident classes
+                        num_target_classes = min(int(n_classes / 10), least_confident_classes.size(1))
+                        target_classes = least_confident_classes[0, :num_target_classes]
+
+                        loss = calculate_loss(features, target_classes.tolist(), config.get('loss_function', 'CE'))
+                        grad = torch.autograd.grad(loss, x, retain_graph=True)[0]
+                        adjusted_grad = infer_step.step(x, grad)

+                    else:
+                        raise ValueError(f"Loss inference method {config['loss_infer']} not supported")
+
+                    if grad is None:
+                        print("Fall-back: Direct gradient calculation failed")
+                        random_noise = (torch.rand_like(x) - 0.5) * 2 * config['step_size']
+                        x = infer_step.project(x.clone() + random_noise)
+                    else:
+                        # Add diffusion noise if specified
+                        diffusion_noise = config.get('diffusion_noise_ratio', 0.0) * torch.randn_like(x).to(device)
+                        x = infer_step.project(x.clone() + adjusted_grad + diffusion_noise)
+
+                except Exception as e:
+                    print(f"Fall-back: Error in gradient calculation: {e}")
+                    random_noise = (torch.rand_like(x) - 0.5) * 2 * config['step_size']
+                    x = infer_step.project(x.clone() + random_noise)

             # Store step if in iterations_to_show
+            if i+1 in config.get('iterations_to_show', []) or i+1 == config['n_itr']:
                 all_steps.append(x[0].detach().cpu())
+                selected_inferred_patterns.append(x[0].detach().cpu())
+
+                # Get current predictions
+                with torch.no_grad():
+                    if config.get('inference_normalization', False):
+                        current_output = model(x)
+                    else:
+                        current_output = core_model(x)
+
+                if isinstance(current_output, torch.Tensor) and current_output.size(-1) == n_classes:
+                    current_probs = F.softmax(current_output, dim=1)
+                    current_conf, current_classes = torch.max(current_probs, 1)
+                    perceived_categories.append(current_classes.item())
+                    confidence_list.append(current_conf.item())
+                else:
+                    perceived_categories.append('N/A')
+                    confidence_list.append(0.0)

+        # Final predictions
         with torch.no_grad():
+            if config.get('inference_normalization', False):
+                final_output = model(x)
             else:
+                final_output = core_model(x)

         final_probs = F.softmax(final_output, dim=1)
         final_conf, final_classes = torch.max(final_probs, 1)

         total_time = time.time() - inference_start

         print(f"Original top class: {classes_orig.item()} ({conf_orig.item():.4f})")
         print(f"Final top class: {final_classes.item()} ({final_conf.item():.4f})")
         print(f"Total inference time: {total_time:.2f} seconds")

+        # Return results in Code 1 format
         return {
             'final_image': x[0].detach().cpu(),
             'steps': all_steps,
             'original_class': classes_orig.item(),
             'original_confidence': conf_orig.item(),
             'final_class': final_classes.item(),
+            'final_confidence': final_conf.item(),
+            'all_categories': perceived_categories,
+            'all_confidences': confidence_list,
         }
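Stripped of fallbacks, each iteration after the first performs one projected drift step, x ← project(x + step_size · grad/||grad|| + σ · noise), exactly the first-iteration handling described in DIFFERENCES.md. A self-contained toy version of the PGDD branch, assuming `InferStep` as defined earlier (the tiny `net`, sizes, and `sigma` are stand-ins, not the demo's actual model or settings):

```python
import torch
import torch.nn.functional as F

net = torch.nn.Conv2d(3, 4, 3, padding=1)            # stand-in feature extractor
x0 = torch.rand(1, 3, 16, 16)
target_feats = net(x0 + 0.05 * torch.randn_like(x0)).detach()  # noisy "prior" target

step = InferStep(x0, eps=0.5, step_size=1.0)
x = x0.clone().requires_grad_(True)
sigma = 0.01                                          # diffusion_noise_ratio stand-in
for _ in range(5):
    loss = F.mse_loss(net(x), target_feats)
    grad = torch.autograd.grad(loss, x)[0]
    # The gradient is added, matching the update in the method above.
    x = step.project(x + step.step(x, grad) + sigma * torch.randn_like(x))
```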
 def show_inference_steps(steps, figsize=(15, 10)):
+    """Show inference steps using matplotlib."""
+    try:
+        import matplotlib.pyplot as plt
+
+        n_steps = len(steps)
+        fig, axes = plt.subplots(1, n_steps, figsize=figsize)
+
+        if n_steps == 1:
+            axes = [axes]
+
+        for i, step_img in enumerate(steps):
+            if isinstance(step_img, torch.Tensor):
+                img = step_img.permute(1, 2, 0).numpy()
+                img = np.clip(img, 0, 1)
+            else:
+                img = step_img
+
+            axes[i].imshow(img)
+            axes[i].set_title(f"Step {i+1}")
+            axes[i].axis('off')
+
+        plt.tight_layout()
+        return fig
+
+    except ImportError:
+        print("Fall-back: matplotlib not available for visualization")
+        return None
+    except Exception as e:
+        print(f"Fall-back: Visualization failed: {e}")
+        return None
+
+# Export the main classes and functions
+__all__ = ['GenerativeInferenceModel', 'get_inference_configs', 'show_inference_steps']
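Putting the pieces together, a minimal driver for the module as committed; `GenerativeInferenceModel`, `get_inference_configs`, and `show_inference_steps` are the module's declared exports, and `face_vase_black.png` ships with this Space (any RGB image path works):

```python
from inference import GenerativeInferenceModel, get_inference_configs, show_inference_steps

engine = GenerativeInferenceModel()
config = get_inference_configs('Prior-Guided Drift Diffusion', eps=0.5, n_itr=50)

result = engine.inference('face_vase_black.png', 'resnet50_robust', config)

print(result['original_class'], '->', result['final_class'])
print(result['all_confidences'][-3:])        # confidence at the last stored steps
fig = show_inference_steps(result['steps'])  # matplotlib figure, or None on failure
```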
logs/model_loading_resnet50_robust_face.log
CHANGED
@@ -2,45 +2,8 @@
 ===== MODEL LOADING REPORT: resnet50_robust_face =====
 Total parameters in checkpoint: 320
 Total parameters in model: 320
-Missing keys:
-Unexpected keys:
-Successfully loaded:
-Loading status:
-
-⚠️ WARNING: Loading from checkpoint is too incomplete.
-⚠️ Falling back to PyTorch's pretrained model to avoid broken inference.
-✅ Successfully loaded PyTorch's pretrained ResNet50 model
-
-Missing keys by layer type:
-  layer3: 95 parameters
-  layer2: 65 parameters
-  layer1: 50 parameters
-  layer4: 50 parameters
-  bn1: 4 parameters
-  fc: 2 parameters
-  conv1: 1 parameters
-
-First 10 missing keys:
-  1. bn1.bias
-  2. bn1.running_mean
-  3. bn1.running_var
-  4. bn1.weight
-  5. conv1.weight
-  6. fc.bias
-  7. fc.weight
-  8. layer1.0.bn1.bias
-  9. layer1.0.bn1.running_mean
-  10. layer1.0.bn1.running_var
-
-First 10 unexpected keys:
-  1. model.bn1.bias
-  2. model.bn1.num_batches_tracked
-  3. model.bn1.running_mean
-  4. model.bn1.running_var
-  5. model.bn1.weight
-  6. model.conv1.weight
-  7. model.fc.bias
-  8. model.fc.weight
-  9. model.layer1.0.bn1.bias
-  10. model.layer1.0.bn1.num_batches_tracked
+Missing keys: 0 parameters
+Unexpected keys: 0 parameters
+Successfully loaded: 320 parameters (100.0%)
+Loading status: ✅ COMPLETE - All important parameters loaded
 ========================================
models/resnet50_imagenet_L2_eps_0.50_checkpoint150.pt
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:40bfb9a204f1d9a305ed6374acbfc55fe2745433cf1e421952d4b461f577486a
-size 196695413

models/resnet50_robust.pt
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:380b14e6f9750bffa1447cf7017f65da4dc5ce71a3dd112f107515dcf7b14d9d
-size 204818947

models/resnet50_robust_face_100_checkpoint.pt
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:c48a5c16ca0d5ac4cb20f1b98e2128838746f18b658728ac661f1ffd589c37bf
-size 196695413

models/robust_resnet50.pt
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:380b14e6f9750bffa1447cf7017f65da4dc5ce71a3dd112f107515dcf7b14d9d
-size 204818947

models/standard_resnet50.pt
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:72d4a99582db5d7fa86c3fd2a089f0bfd6a10f69d635bca51f6ad72ac6b458f0
-size 204818947
stimuli/RandomizedPhaseOvalGray.png
ADDED