# DeepDream MLX: Agents

## 1. The Mission

To resurrect the 2015 DeepDream aesthetic using modern 2025 Apple Silicon hardware, bypassing the need for archaic frameworks like Caffe or Torch7 by porting everything to native MLX.

## 2. Training & Fine-Tuning Plan (The "Punch-Card" Revival)

In the "classic" days (Intel Caffe era), training a custom DeepDream model meant fine-tuning a GoogLeNet on a dataset of specific objects (e.g., slugs, eyes, cars) so the network would hallucinate *those specific things* when dreaming.

**The Roadmap for MLX Training:**

### Phase 1: Dataset Prep

The `dream-creator` logic (from ProGamerGov) is still sound. We need:

1. **Structure:** `dataset/class_name/*.jpg` (Standard PyTorch ImageFolder format).
2. **Cleaning:** Remove corrupt images, deduplicate.
3. **Resizing:** Resize to ~224x224 or 256x256.
4. **Stats:** Calculate Mean/StdDev.

### Phase 2: The Trainer (`train_dream.py`)

We need to write a native MLX training loop.

* **Base Model:** Load `googlenet_mlx.npz`.
* **Architecture:** InceptionV1 (GoogLeNet).
* **Layer Freezing:**
    - **Critical:** Freeze early layers (`conv1`, `conv2`, `inception3a/b`) to preserve the "visual vocabulary" (edges, textures).
    - **Train:** Retrain only the higher layers (`inception4c`, `inception5b`, `fc`) and the Auxiliary Classifiers.
* **Auxiliary Classifiers:** Inception has two side-branches (`aux1`, `aux2`) used for training stability. We must support training these or stripping them.
* **Loss:** Cross-Entropy.
* **Optimizer:** SGD with Momentum (classic) or Adam.

### Phase 3: "Decorrelation" (The Secret Sauce)

`dream-creator` confirms that "Color Decorrelation" is key.

* **Matrix:** A 3x3 matrix calculated from the training set covariance.
* **Effect:** "Whitens" the input image gradients during dreaming, preventing the image from converging to a mono-color blob.
* **Implementation:** Port `data_tools/calc_cm.py` to MLX.

## 3. Animation & Video Strategy

The "Zoom" video effect is the second pillar of DeepDream.

* **Logic:** Feedback Loop.
    1. Dream on Frame N.
    2. Zoom (Scale + Crop center) Frame N to create Frame N+1.
    3. Repeat.
* **Implementation:** A dedicated `dream_video.py` script.
* **Tech:** Use `scipy.ndimage.zoom` (same as original 2015 code) for the scaling, as MLX's `resize` might differ slightly in sub-pixel interpolation.

## 4. Available Models & Wishlist

**Current:**

* `alexnet`: The raw, chaotic ancestor.
* `googlenet` (InceptionV1): The classic "slugs and dogs".
* `vgg16/19`: The "painterly" style transfer beast.
* `resnet50`: Modern, sharp, geometric.

**Wishlist (To Convert):**

* `inception_v3`: More refined hallucinations.
* `googlenet_places365`: Hallucinates landscapes/interiors. (Verified working via `convert.py --download googlenet` when URL is fixed/found).

## 5. Hugging Face Hygiene

* **Repo:** `NickMystic/DeepDream-MLX`
* **LFS:** Track `*.npz`.
* **Cleanup:** Ensure `toConvert/` is empty of large raw files.
* **Banner:** `assets/deepdream_header.jpg`.

---

*Docs derived from deep analysis of `dream-creator` and classic Caffe workflows.*
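
**Sketch: the zoom feedback loop.** The dream, zoom, repeat cycle from the Animation & Video Strategy section could look roughly like the following. This is a minimal sketch, not the contents of `dream_video.py`: the helper names `zoom_center` and `dream_zoom_frames`, the `dream_step` callback, the default `scale=1.05`, and the choice of bilinear interpolation (`order=1`) are all assumptions made for illustration. Only `scipy.ndimage.zoom` itself comes from the plan above.

```python
import numpy as np
from scipy import ndimage


def zoom_center(frame, scale=1.05, order=1):
    """Scale an HxWxC frame up by `scale`, then crop the center back to HxW.

    Hypothetical helper. Assumes scale >= 1 so the zoomed frame is never
    smaller than the original.
    """
    h, w = frame.shape[:2]
    # Zoom the two spatial axes only; leave the channel axis untouched.
    zoomed = ndimage.zoom(frame, (scale, scale, 1), order=order, mode="nearest")
    zh, zw = zoomed.shape[:2]
    top = (zh - h) // 2
    left = (zw - w) // 2
    return zoomed[top:top + h, left:left + w]


def dream_zoom_frames(first_frame, dream_step, n_frames, scale=1.05):
    """Yield `n_frames` dreamed frames, feeding each one back in after a zoom.

    `dream_step` is a placeholder for whatever function runs DeepDream on a
    single HxWxC float array and returns an array of the same shape.
    """
    frame = np.asarray(first_frame, dtype=np.float32)
    for _ in range(n_frames):
        frame = dream_step(frame)          # 1. Dream on frame N
        yield frame
        frame = zoom_center(frame, scale)  # 2. Zoom + center crop -> frame N+1
```

Usage would be along the lines of `for i, f in enumerate(dream_zoom_frames(img, my_dream_fn, 120)): save(f, i)`, where `my_dream_fn` and `save` stand in for the project's own dreaming and image-writing functions. Keeping the zoom in NumPy/SciPy means each frame makes a round trip out of MLX, which is the trade-off the plan accepts in exchange for matching the original interpolation behaviour.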