ttoosi committed on
Commit
89de4e8
1 Parent(s): 29a9491

checkpoint first presentable hallucination demo

Files changed (2)
  1. README.md +53 -18
  2. app.py +156 -42
README.md CHANGED
@@ -34,39 +34,74 @@ This tool uses **generative inference** with adversarially robust neural network
34
 
35
  ## Usage
36
 
37
- 1. **Select an example illusion** or upload your own image
38
- 2. **Click "Load Parameters"** to set optimal prediction settings
39
- 3. **Click "Run Generative Inference"** to predict the hallucination
40
- 4. **View the results**: The model will show what perceptual effects it predicts humans will experience
 
41
 
42
  ## Scientific Background
43
 
44
  This demo is based on research showing that adversarially robust neural networks develop perceptual representations similar to human vision. By using generative inference (optimizing images to maximize model confidence), we can reveal what perceptual structures the network expects to see—which often matches what humans hallucinate or perceive in ambiguous images.
45
 
46
  ## Installation
47
 
48
- To run this demo locally:
49
 
50
- ```bash
51
- # Clone the repository
52
- git clone https://huggingface.co/spaces/ttoosi/Human_Hallucination_Prediction
53
- cd Human_Hallucination_Prediction
 
54
 
55
- # Install dependencies
56
- pip install -r requirements.txt
57
 
58
- # Run the app
59
- python app.py
60
  ```
61
 
62
- The web app will be available at http://localhost:7860.
63
 
64
  ## The Prediction Process
65
 
66
- 1. **Input**: Start with an ambiguous or illusion-inducing image
67
- 2. **Generative Inference**: The robust neural network iteratively modifies the image to maximize its confidence
68
- 3. **Prediction**: The modifications reveal what perceptual structures the network expects—predicting what humans will hallucinate
69
- 4. **Visualization**: View the predicted hallucination emerging step-by-step
70
 
71
  ## Models
72
 
 
34
 
35
  ## Usage
36
 
37
+ 1. **Choose an input**: Pick a pre-configured example illusion from the dropdown, or upload your own image.
38
+ 2. **Load Parameters**: Click **"Load Parameters"** to fill in optimal prediction settings for that example (or adjust them manually).
39
+ 3. **Select the affected part of the visual field**: Click on the input image or the mask preview to set the mask center, then adjust **Mask center X/Y**, **Mask radius**, and **Mask sigma** in the Adaptive Gaussian mask section as needed. The preview circle marks the region that receives the stronger constraint during inference (see the mask sketch after this list).
40
+ 4. **Run inference**: Click **"Run Generative Inference"** to start the prediction. Progress and intermediate steps are shown in the interface.
41
+ 5. **View results**: Inspect the predicted perceptual effects, visualizations, and any generated outputs in the result panels.
42
 
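The adaptive Gaussian mask in step 3 is, conceptually, a per-pixel weight map: flat inside the chosen radius and decaying outside it. A minimal sketch of how such a mask could be built, assuming slider coordinates in [-1, 1] and radius/sigma as fractions of the image size (an illustration, not the app's exact code):

```python
import numpy as np

def adaptive_gaussian_mask(h, w, center_x=0.0, center_y=0.0, radius=0.2, sigma=0.2):
    """Weight map: 1.0 within `radius` of the center, Gaussian fall-off outside."""
    ys, xs = np.mgrid[0:h, 0:w]
    # Normalize pixel coordinates to [-1, 1] to match the slider ranges.
    nx = 2 * xs / (w - 1) - 1
    ny = 2 * ys / (h - 1) - 1
    dist = np.sqrt((nx - center_x) ** 2 + (ny - center_y) ** 2)
    mask = np.ones((h, w))
    outside = dist > radius
    mask[outside] = np.exp(-((dist[outside] - radius) ** 2) / (2 * sigma ** 2))
    return mask

# The epsilon/step multipliers would then interpolate between center and
# periphery, e.g. eps_map = eps_min_mult + (eps_max_mult - eps_min_mult) * mask
```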
43
  ## Scientific Background
44
 
45
  This demo is based on research showing that adversarially robust neural networks develop perceptual representations similar to human vision. By using generative inference (optimizing images to maximize model confidence), we can reveal what perceptual structures the network expects to see—which often matches what humans hallucinate or perceive in ambiguous images.
46
 
47
+ ## Prerequisites
48
+
49
+ - **Python** 3.8 or higher
50
+ - **pip** (Python package manager)
51
+
52
  ## Installation
53
 
54
+ 1. **Clone the repository**
55
 
56
+ ```bash
57
+ git clone https://huggingface.co/spaces/ttoosi/Human_Hallucination_Prediction
58
+ cd Human_Hallucination_Prediction
59
+ ```
60
+
61
+ 2. **Create a virtual environment** (recommended)
62
+
63
+ ```bash
64
+ python -m venv venv
65
+ source venv/bin/activate # On Windows: venv\Scripts\activate
66
+ ```
67
+
68
+ 3. **Install dependencies**
69
+
70
+ ```bash
71
+ pip install -r requirements.txt
72
+ ```
73
 
74
+ 4. **Run the app**
75
 
76
+ ```bash
77
+ python app.py
78
+ ```
79
+
80
+ Optional: specify a port with `--port` (default is 7860):
81
+
82
+ ```bash
83
+ python app.py --port 8861
84
+ ```
85
+
86
+ The web app will be available at **http://localhost:7860** (or the port you specified).
87
+
88
+ **Note:** Model weights (e.g. robust ResNet50) are downloaded automatically from Hugging Face on first run and cached in the `models/` directory. The app also creates a `stimuli/` directory for example images.
89
+
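For orientation, the automatic download can be reproduced with `huggingface_hub`; a hedged sketch, where the `repo_id` and `filename` are placeholders not confirmed by this commit:

```python
from huggingface_hub import hf_hub_download

# Placeholder repo_id/filename; the app's actual weight source may differ.
weights_path = hf_hub_download(
    repo_id="ttoosi/Human_Hallucination_Prediction",
    filename="resnet50_robust.pt",
    cache_dir="models",
)
print(f"Cached weights at: {weights_path}")
```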
90
+ ### Running with Docker
91
+
92
+ ```bash
93
+ docker build -t human-hallucination-prediction .
94
+ docker run -p 7860:7860 human-hallucination-prediction
95
  ```
96
 
97
+ Then open http://localhost:7860 in your browser.
98
 
99
  ## The Prediction Process
100
 
101
+ 1. **Input**: You provide an ambiguous or illusion-inducing image (or use a built-in example).
102
+ 2. **Generative inference**: The adversarially robust network iteratively updates the image to maximize its confidence, guided by your chosen parameters (model, layer, noise, step size, etc.); a minimal loop sketch follows this list.
103
+ 3. **Prediction**: The resulting changes reveal the perceptual structures the network expects—which correspond to what humans tend to hallucinate or perceive in such images.
104
+ 4. **Visualization**: The interface shows the predicted hallucination and intermediate steps as the optimization runs.
105
 
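A minimal sketch of the inference loop in step 2, assuming PyTorch and the parameter names used in this demo's examples (`step_size`, `diffusion_noise`, `iterations`, `epsilon`); the projection step is a simple clamp for illustration and may differ from the app's actual constraint:

```python
import torch

def generative_inference(model, image, step_size=1.0, diffusion_noise=0.002,
                         iterations=100, epsilon=20.0):
    x0 = image.clone()
    x = image.clone().requires_grad_(True)
    for _ in range(iterations):
        logits = model(x)
        # Drift: maximize the model's confidence in its current interpretation.
        conf = logits.softmax(dim=-1).max(dim=-1).values.sum()
        grad, = torch.autograd.grad(conf, x)
        with torch.no_grad():
            x += step_size * grad / (grad.norm() + 1e-12)  # normalized ascent step
            x += diffusion_noise * torch.randn_like(x)     # diffusion noise
            # Illustrative projection: stay within an epsilon ball of the input.
            x.copy_(x0 + (x - x0).clamp(-epsilon, epsilon))
            x.clamp_(0.0, 1.0)
    return x.detach()
```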
106
  ## Models
107
 
app.py CHANGED
@@ -33,6 +33,66 @@ model = GenerativeInferenceModel()
33
 
34
  # Define example images and their parameters with updated values from the research
35
  examples = [
36
  {
37
  "image": os.path.join("stimuli", "urbanoffice1.jpg"),
38
  "name": "UrbanOffice1",
@@ -80,7 +140,17 @@ examples = [
80
  "step_size": 1.0,
81
  "iterations": 101,
82
  "epsilon": 20.0
83
- }
84
  },
85
  {
86
  "image": os.path.join("stimuli", "Kanizsa_square.jpg"),
@@ -99,7 +169,17 @@ examples = [
99
  "step_size": 0.64,
100
  "iterations": 100,
101
  "epsilon": 5.0
102
- }
103
  },
104
  {
105
  "image": os.path.join("stimuli", "CornsweetBlock.png"),
@@ -119,7 +199,17 @@ examples = [
119
  "step_size": 0.8,
120
  "iterations": 51,
121
  "epsilon": 20.0
122
- }
123
  },
124
  {
125
  "image": os.path.join("stimuli", "face_vase.png"),
@@ -138,7 +228,17 @@ examples = [
138
  "step_size": 0.58,
139
  "iterations": 100,
140
  "epsilon": 0.81
141
- }
142
  },
143
  {
144
  "image": os.path.join("stimuli", "Confetti_illusion.png"),
@@ -157,7 +257,17 @@ examples = [
157
  "step_size": 0.5,
158
  "iterations": 101,
159
  "epsilon": 20.0
160
- }
161
  },
162
  {
163
  "image": os.path.join("stimuli", "EhresteinSingleColor.png"),
@@ -176,7 +286,17 @@ examples = [
176
  "step_size": 0.8,
177
  "iterations": 101,
178
  "epsilon": 20.0
179
- }
180
  },
181
  {
182
  "image": os.path.join("stimuli", "GroupingByContinuity.png"),
@@ -195,7 +315,17 @@ examples = [
195
  "step_size": 0.4,
196
  "iterations": 101,
197
  "epsilon": 4.0
198
- }
199
  },
200
  {
201
  "image": os.path.join("stimuli", "figure_ground.png"),
@@ -214,37 +344,17 @@ examples = [
214
  "step_size": 0.5,
215
  "iterations": 101,
216
  "epsilon": 3.0
217
- }
218
- },
219
- {
220
- "image": os.path.join("stimuli", "urbanoffice1.jpg"),
221
- "name": "UrbanOffice1",
222
- "wiki": "https://en.wikipedia.org/wiki/Visual_perception",
223
- "papers": [
224
- "[Adversarially Robust Vision](https://github.com/MadryLab/robustness)",
225
- "[Generative Inference](https://doi.org/10.1016/j.tics.2003.08.003)"
226
- ],
227
- "method": "Prior-Guided Drift Diffusion",
228
- "reverse_diff": {
229
- "model": "resnet50_robust",
230
- "layer": "all",
231
- "initial_noise": 1.0,
232
- "diffusion_noise": 0.002,
233
- "step_size": 1.0,
234
- "iterations": 500,
235
- "epsilon": 40.0
236
  },
237
- "inference_normalization": "off",
238
  "use_adaptive_eps": False,
239
- "use_adaptive_step": True,
240
- "mask_center_x": 0.5,
241
  "mask_center_y": 0.0,
242
  "mask_radius": 0.2,
243
- "mask_sigma": 0.2,
244
- "eps_max_mult": 20.0,
245
  "eps_min_mult": 1.0,
246
- "step_max_mult": 50.0,
247
- "step_min_mult": 0.2,
248
  }
249
  ]
250
 
@@ -398,6 +508,10 @@ def draw_mask_overlay(image, center_x, center_y, radius):
398
  # Helper function to apply example parameters (adaptive mask off by default unless example defines it)
399
  def apply_example(example):
400
  rd = example["reverse_diff"]
401
  return [
402
  example["image"],
403
  rd.get("model", "resnet50_robust"),
@@ -410,14 +524,15 @@ def apply_example(example):
410
  rd["layer"],
411
  example.get("use_adaptive_eps", False),
412
  example.get("use_adaptive_step", False),
413
- example.get("mask_center_x", 0.0),
414
- example.get("mask_center_y", 0.0),
415
  example.get("mask_radius", 0.3),
416
  example.get("mask_sigma", 0.2),
417
  example.get("eps_max_mult", 4.0),
418
  example.get("eps_min_mult", 1.0),
419
  example.get("step_max_mult", 4.0),
420
  example.get("step_min_mult", 1.0),
421
  gr.Group(visible=True),
422
  ]
423
 
@@ -433,17 +548,15 @@ with gr.Blocks(title="Human Hallucination Prediction", css="""
433
  }
434
  """) as demo:
435
  gr.Markdown("# Human Hallucination Prediction")
436
- gr.Markdown("**Predict what visual hallucinations humans will experience** using adversarially robust neural networks. This demo forecasts perceptual phenomena like illusory contours, figure-ground reversals, and Gestalt effects before humans report them.")
437
 
438
  gr.Markdown("""
439
  **How to predict hallucinations:**
440
- 1. **Select an example illusion** below and click "Load Parameters" to set optimal prediction settings
441
- 2. **Click "Run Generative Inference"** to predict what hallucination humans will perceive
442
  3. **View the prediction**: Watch as the model reveals the perceptual structures it expects—matching what humans typically hallucinate
443
- 4. **Upload your own images** to test if they will induce hallucinations in human observers
444
  """)
445
-
446
- # Main processing interface
447
  with gr.Row():
448
  with gr.Column(scale=1):
449
  # Inputs
@@ -503,7 +616,7 @@ with gr.Blocks(title="Human Hallucination Prediction", css="""
503
  mask_center_y_slider = gr.Slider(minimum=-1.0, maximum=1.0, value=0.0, step=0.05, label="Mask center Y")
504
  with gr.Row():
505
  mask_radius_slider = gr.Slider(minimum=0.01, maximum=1.0, value=0.2, step=0.01, label="Mask radius (flat region size)")
506
- mask_sigma_slider = gr.Slider(minimum=0.05, maximum=0.5, value=0.2, step=0.01, label="Mask sigma (fall-off outside radius)")
507
  with gr.Row():
508
  eps_max_mult_slider = gr.Slider(minimum=0.1, maximum=350.0, value=20.0, step=0.1, label="Epsilon: multiplier at center")
509
  eps_min_mult_slider = gr.Slider(minimum=0.1, maximum=10.0, value=1.0, step=0.1, label="Epsilon: multiplier at periphery")
@@ -542,6 +655,7 @@ with gr.Blocks(title="Human Hallucination Prediction", css="""
542
  mask_radius_slider, mask_sigma_slider,
543
  eps_max_mult_slider, eps_min_mult_slider,
544
  step_max_mult_slider, step_min_mult_slider,
545
  params_section,
546
  ],
547
  )
 
33
 
34
  # Define example images and their parameters with updated values from the research
35
  examples = [
36
+ {
37
+ "image": os.path.join("stimuli", "farm1.jpg"),
38
+ "name": "farm1",
39
+ "wiki": "https://en.wikipedia.org/wiki/Visual_perception",
40
+ "papers": [
41
+ "[Adversarially Robust Vision](https://github.com/MadryLab/robustness)",
42
+ "[Generative Inference](https://doi.org/10.1016/j.tics.2003.08.003)"
43
+ ],
44
+ "method": "Prior-Guided Drift Diffusion",
45
+ "reverse_diff": {
46
+ "model": "resnet50_robust",
47
+ "layer": "all",
48
+ "initial_noise": 0.0,
49
+ "diffusion_noise": 0.02,
50
+ "step_size": 1.0,
51
+ "iterations": 501,
52
+ "epsilon": 40.0
53
+ },
54
+ "inference_normalization": "off",
55
+ "use_adaptive_eps": False,
56
+ "use_adaptive_step": False,
57
+ "mask_center_x": 0.0,
58
+ "mask_center_y": 0.0,
59
+ "mask_radius": 0.2,
60
+ "mask_sigma": 0.3,
61
+ "eps_max_mult": 300.0,
62
+ "eps_min_mult": 1.0,
63
+ "step_max_mult": 10.0,
64
+ "step_min_mult": 1.0,
65
+ },
66
+ {
67
+ "image": os.path.join("stimuli", "ArtGallery1.jpg"),
68
+ "name": "ArtGallery1",
69
+ "wiki": "https://en.wikipedia.org/wiki/Visual_perception",
70
+ "papers": [
71
+ "[Adversarially Robust Vision](https://github.com/MadryLab/robustness)",
72
+ "[Generative Inference](https://doi.org/10.1016/j.tics.2003.08.003)"
73
+ ],
74
+ "method": "Prior-Guided Drift Diffusion",
75
+ "reverse_diff": {
76
+ "model": "resnet50_robust",
77
+ "layer": "layer4",
78
+ "initial_noise": 0.5,
79
+ "diffusion_noise": 0.002,
80
+ "step_size": 0.1,
81
+ "iterations": 501,
82
+ "epsilon": 40.0
83
+ },
84
+ "inference_normalization": "off",
85
+ "use_adaptive_eps": False,
86
+ "use_adaptive_step": True,
87
+ "mask_center_x": 0.0,
88
+ "mask_center_y": -1.0,
89
+ "mask_radius": 0.1,
90
+ "mask_sigma": 0.2,
91
+ "eps_max_mult": 30.0,
92
+ "eps_min_mult": 1.0,
93
+ "step_max_mult": 100.0,
94
+ "step_min_mult": 1.0,
95
+ },
96
  {
97
  "image": os.path.join("stimuli", "urbanoffice1.jpg"),
98
  "name": "UrbanOffice1",
 
140
  "step_size": 1.0,
141
  "iterations": 101,
142
  "epsilon": 20.0
143
+ },
144
+ "use_adaptive_eps": False,
145
+ "use_adaptive_step": False,
146
+ "mask_center_x": 0.0,
147
+ "mask_center_y": 0.0,
148
+ "mask_radius": 0.2,
149
+ "mask_sigma": 1.0,
150
+ "eps_max_mult": 1.0,
151
+ "eps_min_mult": 1.0,
152
+ "step_max_mult": 1.0,
153
+ "step_min_mult": 1.0,
154
  },
155
  {
156
  "image": os.path.join("stimuli", "Kanizsa_square.jpg"),
 
169
  "step_size": 0.64,
170
  "iterations": 100,
171
  "epsilon": 5.0
172
+ },
173
+ "use_adaptive_eps": False,
174
+ "use_adaptive_step": False,
175
+ "mask_center_x": 0.0,
176
+ "mask_center_y": 0.0,
177
+ "mask_radius": 0.2,
178
+ "mask_sigma": 1.0,
179
+ "eps_max_mult": 1.0,
180
+ "eps_min_mult": 1.0,
181
+ "step_max_mult": 1.0,
182
+ "step_min_mult": 1.0,
183
  },
184
  {
185
  "image": os.path.join("stimuli", "CornsweetBlock.png"),
 
199
  "step_size": 0.8,
200
  "iterations": 51,
201
  "epsilon": 20.0
202
+ },
203
+ "use_adaptive_eps": False,
204
+ "use_adaptive_step": False,
205
+ "mask_center_x": 0.0,
206
+ "mask_center_y": 0.0,
207
+ "mask_radius": 0.2,
208
+ "mask_sigma": 1.0,
209
+ "eps_max_mult": 1.0,
210
+ "eps_min_mult": 1.0,
211
+ "step_max_mult": 1.0,
212
+ "step_min_mult": 1.0,
213
  },
214
  {
215
  "image": os.path.join("stimuli", "face_vase.png"),
 
228
  "step_size": 0.58,
229
  "iterations": 100,
230
  "epsilon": 0.81
231
+ },
232
+ "use_adaptive_eps": False,
233
+ "use_adaptive_step": False,
234
+ "mask_center_x": 0.0,
235
+ "mask_center_y": 0.0,
236
+ "mask_radius": 0.2,
237
+ "mask_sigma": 1.0,
238
+ "eps_max_mult": 1.0,
239
+ "eps_min_mult": 1.0,
240
+ "step_max_mult": 1.0,
241
+ "step_min_mult": 1.0,
242
  },
243
  {
244
  "image": os.path.join("stimuli", "Confetti_illusion.png"),
 
257
  "step_size": 0.5,
258
  "iterations": 101,
259
  "epsilon": 20.0
260
+ },
261
+ "use_adaptive_eps": False,
262
+ "use_adaptive_step": False,
263
+ "mask_center_x": 0.0,
264
+ "mask_center_y": 0.0,
265
+ "mask_radius": 0.2,
266
+ "mask_sigma": 1.0,
267
+ "eps_max_mult": 1.0,
268
+ "eps_min_mult": 1.0,
269
+ "step_max_mult": 1.0,
270
+ "step_min_mult": 1.0,
271
  },
272
  {
273
  "image": os.path.join("stimuli", "EhresteinSingleColor.png"),
 
286
  "step_size": 0.8,
287
  "iterations": 101,
288
  "epsilon": 20.0
289
+ },
290
+ "use_adaptive_eps": False,
291
+ "use_adaptive_step": False,
292
+ "mask_center_x": 0.0,
293
+ "mask_center_y": 0.0,
294
+ "mask_radius": 0.2,
295
+ "mask_sigma": 1.0,
296
+ "eps_max_mult": 1.0,
297
+ "eps_min_mult": 1.0,
298
+ "step_max_mult": 1.0,
299
+ "step_min_mult": 1.0,
300
  },
301
  {
302
  "image": os.path.join("stimuli", "GroupingByContinuity.png"),
 
315
  "step_size": 0.4,
316
  "iterations": 101,
317
  "epsilon": 4.0
318
+ },
319
+ "use_adaptive_eps": False,
320
+ "use_adaptive_step": False,
321
+ "mask_center_x": 0.0,
322
+ "mask_center_y": 0.0,
323
+ "mask_radius": 0.2,
324
+ "mask_sigma": 1.0,
325
+ "eps_max_mult": 1.0,
326
+ "eps_min_mult": 1.0,
327
+ "step_max_mult": 1.0,
328
+ "step_min_mult": 1.0,
329
  },
330
  {
331
  "image": os.path.join("stimuli", "figure_ground.png"),
 
344
  "step_size": 0.5,
345
  "iterations": 101,
346
  "epsilon": 3.0
347
  },
348
  "use_adaptive_eps": False,
349
+ "use_adaptive_step": False,
350
+ "mask_center_x": 0.0,
351
  "mask_center_y": 0.0,
352
  "mask_radius": 0.2,
353
+ "mask_sigma": 1.0,
354
+ "eps_max_mult": 1.0,
355
  "eps_min_mult": 1.0,
356
+ "step_max_mult": 1.0,
357
+ "step_min_mult": 1.0,
358
  }
359
  ]
360
 
 
508
  # Helper function to apply example parameters (adaptive mask off by default unless example defines it)
509
  def apply_example(example):
510
  rd = example["reverse_diff"]
511
+ mcx = example.get("mask_center_x", 0.0)
512
+ mcy = example.get("mask_center_y", 0.0)
513
+ mrad = example.get("mask_radius", 0.3)
514
+ mask_img = draw_mask_overlay(example["image"], mcx, mcy, mrad)
515
  return [
516
  example["image"],
517
  rd.get("model", "resnet50_robust"),
 
524
  rd["layer"],
525
  example.get("use_adaptive_eps", False),
526
  example.get("use_adaptive_step", False),
527
+ mcx,
528
+ mcy,
529
  example.get("mask_radius", 0.3),
530
  example.get("mask_sigma", 0.2),
531
  example.get("eps_max_mult", 4.0),
532
  example.get("eps_min_mult", 1.0),
533
  example.get("step_max_mult", 4.0),
534
  example.get("step_min_mult", 1.0),
535
+ mask_img,
536
  gr.Group(visible=True),
537
  ]
538
 
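The `mask_img` preview returned above comes from `draw_mask_overlay`, whose body sits outside this diff (the hunk header shows its signature: `draw_mask_overlay(image, center_x, center_y, radius)`). A plausible sketch under that signature, assuming PIL and slider coordinates in [-1, 1]; the drawing style is an assumption:

```python
from PIL import Image, ImageDraw

def draw_mask_overlay(image, center_x, center_y, radius):
    # Accept either a file path or a numpy array (both appear in Gradio apps).
    img = Image.open(image).convert("RGB") if isinstance(image, str) else Image.fromarray(image)
    w, h = img.size
    # Map slider coordinates in [-1, 1] to pixel coordinates.
    cx = (center_x + 1) / 2 * w
    cy = (center_y + 1) / 2 * h
    r = radius * min(w, h) / 2  # radius as a fraction of image size (assumed)
    draw = ImageDraw.Draw(img)
    draw.ellipse([cx - r, cy - r, cx + r, cy + r], outline=(255, 0, 0), width=3)
    return img
```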
 
548
  }
549
  """) as demo:
550
  gr.Markdown("# Human Hallucination Prediction")
551
+ gr.Markdown("**Predict what visual hallucinations humans may experience** using neural networks.")
552
 
553
  gr.Markdown("""
554
  **How to predict hallucinations:**
555
+ 1. **Select an example image** below and click "Load Parameters" to load the prediction settings
556
+ 2. **Click "Run Generative Inference"** to predict what hallucination humans may perceive
557
  3. **View the prediction**: Watch as the model reveals the perceptual structures it expects—matching what humans typically hallucinate
558
+ 4. **Upload your own images** to try the prediction on your own stimuli
559
  """)
 
 
560
  with gr.Row():
561
  with gr.Column(scale=1):
562
  # Inputs
 
616
  mask_center_y_slider = gr.Slider(minimum=-1.0, maximum=1.0, value=0.0, step=0.05, label="Mask center Y")
617
  with gr.Row():
618
  mask_radius_slider = gr.Slider(minimum=0.01, maximum=1.0, value=0.2, step=0.01, label="Mask radius (flat region size)")
619
+ mask_sigma_slider = gr.Slider(minimum=0.05, maximum=1.0, value=0.2, step=0.01, label="Mask sigma (fall-off outside radius)")
620
  with gr.Row():
621
  eps_max_mult_slider = gr.Slider(minimum=0.1, maximum=350.0, value=20.0, step=0.1, label="Epsilon: multiplier at center")
622
  eps_min_mult_slider = gr.Slider(minimum=0.1, maximum=10.0, value=1.0, step=0.1, label="Epsilon: multiplier at periphery")
 
655
  mask_radius_slider, mask_sigma_slider,
656
  eps_max_mult_slider, eps_min_mult_slider,
657
  step_max_mult_slider, step_min_mult_slider,
658
+ mask_preview,
659
  params_section,
660
  ],
661
  )