ymlin105 committed on Feb 16

Commit

58839b6

1 Parent(s): 6849757

feat: initial implementation of MNIST Hybrid SVD-CNN core


Files changed (27)

README.md +67 -124
experiments/01_phenomenon_diagnosis.py +91 -0
experiments/01_phenomenon_discovery.py +0 -226
experiments/02_mechanistic_analysis.py +102 -0
experiments/02_mnist_cnn_confusion.py +0 -68
experiments/03_mechanistic_investigation.py +0 -244
experiments/04_robustness_limit.py +0 -187
experiments/05_manifold_learning.py +0 -103
experiments/06_fashion_mnist_baseline.py +0 -115
experiments/07_fashion_cnn_verification.py +0 -145
experiments/08_hybrid_robustness.py +0 -253
experiments/09_fashion_hybrid_robustness.py +0 -189
experiments/10_ablation_study.py +0 -344
experiments/11_learning_curves.py +0 -228
experiments/12_roc_analysis.py +0 -291
experiments/13_per_class_metrics.py +0 -366
experiments/appendix_learning_curves.py +26 -0
experiments/appendix_per_class_metrics.py +53 -0
experiments/run_robustness_test.py +65 -0
src/__init__.py +0 -0
src/config.py +17 -0
src/exp_utils.py +68 -0
src/hybrid_model.py +45 -0
src/train_fashion.py +27 -0
src/train_models.py +72 -0
src/utils.py +116 -0
src/viz.py +195 -0

README.md CHANGED

@@ -10,151 +10,94 @@ app_file: app.py
 pinned: false
 ---
-# Linear vs. Non-linear Manifold Geometry: A Robustness Analysis
-[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/ymlin105/Coconut-MNIST)
-This started from a course assignment on SVD-based MNIST classification. I noticed digit **3 vs 8** was a failure mode, which led to a 9-experiment investigation into *why* linear projections fail and *when* they can still be useful.
-- **Problem** — Linear models like SVD struggle to distinguish handwritten **3s** and **8s** because they prioritize global pixel variance and overlook the "topological gap" that characterizes a 3.
-- **Main Question** — When does a linear low-rank projection (SVD) improve robustness, and when does it destroy the features a non-linear model needs?
-- **What I did** — Diagnosed SVD failure on the **3 vs 8** pair, verified non-linear advantage with a CNN, then built and evaluated a **Hybrid SVD→CNN** pipeline under test-time corruptions.
-- **Key Takeaways** — SVD behaves like a low-pass filter; CNN attention is local/non-linear; Hybrid helps under high Gaussian noise on MNIST but fails on texture-heavy Fashion-MNIST.
-- **Full technical report →** [REPORT.md](./docs/REPORT.md)
-> **Beyond MNIST:** A denoiser layer is only useful if we can predict its failure modes. In medical imaging, satellite data, and any domain where a linear filter is tempting, knowing the failure boundary prevents accuracy collapse. This question generalizes beyond toy datasets.
-> **Key result:** Hybrid SVD→CNN achieves **+4.8 pp over standalone CNN** at σ=0.7 on MNIST; identifies a **24.6 pp failure boundary** on Fashion-MNIST — pinpointing when linear denoising helps vs. destroys accuracy.
-![Hybrid SVD→CNN Pipeline](docs/research_results/pipeline_diagram.png)
-*Figure: The Hybrid SVD→CNN pipeline. SVD reconstruction acts as a data-adapted low-pass filter before CNN feature extraction.*
-![Geometric Analysis](docs/research_results/fig_06_explainability.png)
-*Figure: CNNs (center) focus on the local topological gap, while SVD (right) hallucinates a closed loop to satisfy global variance.*
-> **[Try the live demo →](https://huggingface.co/spaces/ymlin105/Coconut-MNIST)** — Inject noise/blur in real time and compare SVD vs CNN vs Hybrid predictions side-by-side.
-<details>
-<summary><strong>Quick Start & Project Structure</strong></summary>
-**Tech stack:** Python 3.9 · PyTorch · scikit-learn · Streamlit · UMAP · Plotly
-```bash
-# 1. Create and activate conda environment
-conda create -n hybrid-svd python=3.9
-conda activate hybrid-svd
-# 2. Install dependencies
-pip install -r requirements.txt
-# 3. Train SVD + CNN models (~1 min, now with validation & early stopping!)
-python src/train_models.py
-# 4. Launch interactive dashboard
-streamlit run app.py
-# Optional: Run additional analysis
-python experiments/10_ablation_study.py      # Depth vs Non-linearity analysis
-python experiments/11_learning_curves.py     # Training dynamics visualization
-python experiments/12_roc_analysis.py        # ROC curves for 3 vs 8
-python experiments/13_per_class_metrics.py   # Detailed per-class metrics
-```
-> **What's New in v2.0**: Enhanced training with validation set splitting (80/20), early stopping (patience=3), reproducible random seeds (seed=42), and 4 new experiments. Key results: **Non-linearity alone provides +4.91 pp gain** (ablation study), **CNN achieves AUC=1.0 on 3vs8** (ROC analysis), **CNN reduces worst confusion from 6.5% to 2.2%** (per-class metrics). See [`docs/IMPROVEMENTS.md`](docs/IMPROVEMENTS.md) and [`docs/QUICKSTART.md`](docs/QUICKSTART.md) for details.
 ```
-Project Structure
-├── src/               Core modules: CNN, SVD layer, hybrid pipeline, training
-├── experiments/       13 self-contained scripts (01–13), ordered by narrative
-├── docs/              Full report (REPORT.md) + 20 figures and JSON metrics
-├── models/            Pretrained checkpoints (CNN for MNIST & Fashion-MNIST)
-└── app.py             Streamlit dashboard (live demo)
-```
-</details>
-## Approach
-```
-Diagnosis                  Mechanism                   Solution & Boundary
-─────────────────────      ─────────────────────        ─────────────────────
-SVD fails on 3 vs 8   →   Why? Grad-CAM + UMAP    →   Hybrid SVD→CNN pipeline
-(Exp 1–3)                  (Exp 4–7)                   + Fashion-MNIST stress test
-                                                        (Exp 8–9)
-```
-The Hybrid pipeline passes each input through a fixed SVD reconstruction (rank $k{=}20$) before the CNN classifier. SVD acts as a data-adapted low-pass filter — suppressing high-frequency noise while retaining structure aligned with the training manifold.
-## Case Study 1: Success on Low-Rank Manifolds (MNIST)
-> *See [REPORT.md](./docs/REPORT.md#experiment-8-hybrid-architecture-validation-the-solution) for full details.*
-On shape-based data, the **Hybrid** architecture acts as a denoiser, filtering noise while preserving structure.
-| Model            | Clean  | σ=0.3 | σ=0.5 | σ=0.7           |
-| ---------------- | ------ | ------ | ------ | ---------------- |
-| CNN              | 98.74% | 96.36% | 80.44% | 54.34%           |
-| SVD              | 88.12% | 80.87% | 64.60% | 51.30%           |
-| Blur+CNN         | 94.25% | 83.78% | 63.38% | 44.54%           |
-| **Hybrid** | 91.82% | 88.57% | 79.26% | **59.10%** |
-*Results are mean over 3 random seeds. Bold indicates where Hybrid surpasses the standalone CNN.*
-**Result**: The Hybrid improves over the standalone CNN at high noise ($\sigma=0.7$: 59.10% vs 54.34%), consistent with SVD reconstruction suppressing high-frequency noise before CNN feature extraction. The crossover point where Hybrid surpasses the CNN occurs between $\sigma=0.5$ and $\sigma=0.7$. The Hybrid also outperforms the Gaussian blur baseline (59.10% vs 44.54%), confirming that SVD provides data-adapted denoising beyond generic smoothing.
-![Robustness Curves](docs/research_results/fig_10_hybrid_robustness.png)
-*Figure: Accuracy vs. noise level (σ). The Hybrid (orange) crosses above the standalone CNN (blue) at high noise, confirming the denoising benefit on low-rank data.*
-## Case Study 2: Failure on Texture-Rich Manifolds (Fashion-MNIST)
-> *See [REPORT.md](./docs/REPORT.md#experiment-9-boundary-analysis-fashion-mnist) for the full boundary analysis.*
-> On texture-dependent data, SVD filtering destroys high-frequency details (e.g., collar vs. no collar), causing the Hybrid model to collapse.
-| Model            | MNIST (Clean) | Fashion-MNIST (Clean) |
-| ---------------- | ------------- | --------------------- |
-| **CNN**    | 98.74%        | **91.04%**      |
-| **Hybrid** | 91.82%        | **67.27%**      |
-**Boundary Identified**: This method shows a **~24.6-point** clean-accuracy drop on Fashion-MNIST (texture-dependent), confirming it is primarily suitable for low-rank, shape-defined manifolds.
-<details>
-<summary><strong>Geometric Mechanics (why SVD fails and when it helps)</strong></summary>
-- **Feature energy paradox**: discriminative cues can be "low-energy" (gaps, textures) and get wiped out by low-rank projection.
-- **Manifold alignment check**: UMAP shows 3/8 are separable when local neighborhood structure is preserved.
-- **Subspace denoising**: an SVD reconstruction step can act as a data-adapted low-pass filter before the CNN.
-</details>
-<details>
-<summary><strong>Evidence & Reproducibility</strong></summary>
-All figures and metrics are in [`docs/research_results/`](docs/research_results). Each experiment has a single self-contained script in [`experiments/`](experiments/) numbered `01`–`13`, ordered to follow the narrative (diagnosis → mechanism → solution → boundary → validation).
-<details>
-<summary>Full experiment list</summary>
-| #            | Script                             | What it produces                                       |
-| ------------ | ---------------------------------- | ------------------------------------------------------ |
-| 01           | `phenomenon_discovery.py`        | SVD failure analysis + spectrum                        |
-| 02           | `mnist_cnn_confusion.py`         | MNIST CNN confusion matrix                             |
-| 03           | `mechanistic_investigation.py`   | Interpolation + Grad-CAM vs reconstruction             |
-| 04           | `robustness_limit.py`            | SVD vs CNN degradation curves                          |
-| 05           | `manifold_learning.py`           | SVD vs UMAP manifold comparison                        |
-| 06           | `fashion_mnist_baseline.py`      | Fashion-MNIST SVD baseline                             |
-| 07           | `fashion_cnn_verification.py`    | Fashion-MNIST CNN confusion                            |
-| 08           | `hybrid_robustness.py`           | MNIST robustness +`robustness_mnist_noise.json`      |
-| 09           | `fashion_hybrid_robustness.py`   | Fashion robustness +`robustness_fashion_noise.json`  |
-| **10** | **`ablation_study.py`**    | **Depth vs Non-linearity contribution analysis** |
-| **11** | **`learning_curves.py`**   | **Training/validation dynamics visualization**   |
-| **12** | **`roc_analysis.py`**      | **ROC curves for 3 vs 8 classification**         |
-| **13** | **`per_class_metrics.py`** | **Precision/Recall/F1 per digit + CSV reports**  |
-## Limitations & Open Questions
-- **Clean-accuracy penalty**: The Hybrid trades ~7 pp of clean accuracy for high-noise robustness. Can adaptive rank selection ($k$ as a function of input noise estimate) eliminate this penalty?
-- **Texture-dependent failure**: The method collapses on Fashion-MNIST. Does this failure boundary generalize to other "texture vs shape" splits (e.g., CIFAR-10, medical imaging)?
-- **Fixed rank**: $k{=}20$ was chosen as a round number capturing ~70% variance — not tuned. A learned or input-dependent rank could improve the trade-off.
-- **Scope**: This is a mechanistic study on MNIST-scale data, not a production defense. Scaling to higher-resolution images would require rethinking the SVD layer.
 ---

 pinned: false
 ---
+# SVD vs CNN: Mechanistic Analysis of Manifold Alignment on MNIST
+[![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/ymlin105/Coconut-MNIST) [![Full Report](https://img.shields.io/badge/📖_Read-Full_Report-blue)](./docs/REPORT.md)
+Truncated SVD, as a linear dimensionality reduction, is known to act as a low-pass filter. This project gives a **concrete, visual, and quantitative mechanistic explanation** of how that property affects digit classification, specifically why a low-rank linear subspace tends to collapse a "3" into an "8".
+<p align="center">
+  <img src="./docs/research_results/fig_06_explainability.png" width="600" alt="Mechanistic Analysis of SVD Inductive Bias">
+</p>
+By comparing where a linear, variance-driven model fails and a non-linear model (CNN) succeeds, I measure the **inherent trade-offs** of linear denoising on MNIST and Fashion-MNIST. The same trade-off matters in high-stakes domains such as medical imaging or satellite data, where a linear filter might suppress critical diagnostic features while reducing noise variance.
+## The Solution: Hybrid SVD-CNN
+I combine SVD's strength as a data-adapted low-pass filter with the CNN's robust feature extraction into a single pipeline.
+```mermaid
+flowchart TD
+    subgraph S1 [I. Noisy Manifold]
+        direction LR
+        X["Input $X + \eta$"]
+    end
+    subgraph S2 [II. Adaptive Projection]
+        direction LR
+        node_SVD["SVD: $X = U \Sigma V^T$"]
+        node_Trunc["$k$-Rank Truncation"]
+        node_Recon["$\hat{X} = \sum \sigma_i u_i v_i^T$"]
+        node_SVD --> node_Trunc --> node_Recon
+    end
+    subgraph S3 [III. CNN Features]
+        direction LR
+        node_Conv["Conv Layers"] --> node_Pool["Pooling / ReLU"] --> node_Flat["Global Flatten"]
+    end
+    subgraph S4 [IV. Latent Mapping]
+        direction LR
+        node_Soft["Logits / Softmax"] --> node_Pred["Class Prediction"]
+    end
+    S1 --> S2
+    S2 --> S3
+    S3 --> S4
+    style S2 fill:#f8f9ff,stroke:#0056b3,stroke-width:2px
+    style S3 fill:#f8fff9,stroke:#28a745,stroke-width:2px
+    style S1 fill:#fff,stroke:#333
+    style S4 fill:#fff,stroke:#333
 ```
+### Key Takeaways
+For full analysis and detailed metrics, see the [Technical Report](./docs/REPORT.md).
+1. **The Variance Trap**: Discriminative details (like the gap in a "3") carry very little pixel variance. A low-rank SVD projection discards them as noise, so distinct digit manifolds overlap and a "3" is systematically reconstructed as an "8".
+2. **Local Logic**: UMAP shows that the 3 and 8 manifolds are separable when local neighborhood structure is preserved, whereas a variance-maximizing linear projection destroys that structure.
+3. **Hybrid Advantage**: Under high noise ($\sigma=0.7$), the Hybrid architecture acts as a data-adapted denoiser and outperforms the standalone CNN by +4.8 pp.
+4. **The Boundary**: On texture-rich data (e.g., Fashion-MNIST), SVD reconstruction destroys critical high-frequency features, which marks a practical limit of linear denoising.
+---
+## Experience it Yourself
+### Online Demo
+Try the live dashboard to inject noise, adjust the SVD rank, and compare model predictions in real time:
+**[Launch Streamlit App](https://huggingface.co/spaces/ymlin105/Coconut-MNIST)**.
+### Local Installation
+```bash
+# Clone the repository
+git clone https://github.com/ymlin105/mnist-linear-vs-nonlinear.git
+cd mnist-linear-vs-nonlinear
+# Install dependencies
+pip install -r requirements.txt
+# Launch the interactive dashboard
+streamlit run app.py
+```
+### Project Structure
+```
+├── src/               Core modules (CNN, SVD layer) + Experimental Utils
+├── experiments/       Theme-based scripts (01 Diagnosis, 02 Analysis, 03 Robustness)
+├── docs/              Full report (REPORT.md) + figures
+├── models/            Pretrained checkpoints
+├── run_all_experiments.sh  One-click reproduction script
+└── app.py             Streamlit dashboard
+```
 ---
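For reviewers, here is a minimal NumPy sketch of the rank-k reconstruction step that the README's Mermaid diagram labels "Adaptive Projection". It is an illustration under stated assumptions, not the code in `src/hybrid_model.py`: the names `fit_svd_basis`, `svd_reconstruct`, and `demo` are hypothetical, and the data is synthetic low-rank data rather than MNIST.

```python
import numpy as np


def fit_svd_basis(X_train, k):
    """Return the training mean and the top-k right singular vectors (as rows)."""
    mean = X_train.mean(axis=0)
    # Economy SVD of the centered data; rows of Vt are ordered by singular value.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:k]


def svd_reconstruct(X, mean, V_k):
    """Project samples onto the rank-k subspace and map them back to input space."""
    return (X - mean) @ V_k.T @ V_k + mean


def demo(seed=0, sigma=0.7):
    """Return the error of noisy inputs before and after rank-k reconstruction."""
    rng = np.random.default_rng(seed)
    # Hypothetical stand-in for flattened images: 500 samples on a 5-D subspace of R^64.
    clean = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 64))
    noisy = clean + rng.normal(scale=sigma, size=clean.shape)
    mean, V_k = fit_svd_basis(clean, k=5)
    denoised = svd_reconstruct(noisy, mean, V_k)
    return np.linalg.norm(noisy - clean), np.linalg.norm(denoised - clean)


if __name__ == "__main__":
    err_noisy, err_denoised = demo()
    print(f"error before: {err_noisy:.2f}, after: {err_denoised:.2f}")
```

The projection removes every noise component orthogonal to the learned subspace, but it removes genuine signal there too, which is the trade-off the README describes for texture-rich data.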

experiments/01_phenomenon_diagnosis.py ADDED

@@ -0,0 +1,91 @@

+"""
+Exp 01: Phenomenon Diagnosis
+Combines Global/Focused SVD analysis with CNN baseline comparisons.
+Refactored to use centralized utility modules.
+"""
+import torch
+import numpy as np
+from sklearn.decomposition import TruncatedSVD
+from sklearn.linear_model import LogisticRegression
+from sklearn.metrics import accuracy_score
+from src import config, utils, viz, exp_utils
+def run_svd_analysis(X_train, y_train, X_test, y_test):
+    print("\n--- Running SVD Spectral Analysis ---")
+    mean = np.mean(X_train, axis=0)
+    X_centered = X_train - mean
+    n_view = 300
+    svd = TruncatedSVD(n_components=n_view, random_state=42)
+    svd.fit(X_centered)
+    # 1. Visualization: Spectrum
+    viz.plot_singular_spectrum(
+        svd.singular_values_,
+        np.cumsum(svd.explained_variance_ratio_),
+        'fig_01_spectrum.png'
+    )
+    # 2. Classification with k=20
+    svd_20 = TruncatedSVD(n_components=20, random_state=42)
+    X_train_pca = svd_20.fit_transform(X_train - mean)
+    X_test_pca = svd_20.transform(X_test - mean)
+    clf = LogisticRegression(max_iter=1000)
+    clf.fit(X_train_pca, y_train)
+    y_pred = clf.predict(X_test_pca)
+    acc = accuracy_score(y_test, y_pred)
+    print(f"SVD (k=20) Accuracy: {acc*100:.2f}%")
+    # 3. Visualization: Confusion Matrix & Eigen-digits
+    viz.plot_confusion_matrix(
+        y_test, y_pred, list(range(10)),
+        'fig_02_svd_confusion.png',
+        f'SVD Confusion Matrix (Acc={acc:.2%})',
+        viz.COLOR_SVD
+    )
+    component_titles = [f"Comp {i+1}" for i in range(10)]
+    viz.plot_multi_image_grid(
+        [c.reshape(28, 28) for c in svd_20.components_[:10]],
+        component_titles, 2, 5,
+        'fig_03_eigen_digits.png',
+        'Global SVD Eigen-digits'
+    )
+def run_cnn_baseline(device):
+    print("\n--- Running CNN Baseline Diagnosis ---")
+    svd_p, cnn = utils.load_models(dataset_name="mnist")
+    X_test, y_test = utils.load_data_split(dataset_name="mnist", train=False)
+    acc = exp_utils.evaluate_classifier(cnn, X_test, y_test, device=device, is_pytorch=True)
+    print(f"CNN Accuracy: {acc*100:.2f}%")
+    # Predict for confusion matrix
+    cnn.eval()
+    with torch.no_grad():
+        preds = cnn(X_test.to(device)).argmax(dim=1).cpu().numpy()
+    viz.plot_confusion_matrix(
+        y_test.numpy(), preds, list(range(10)),
+        'fig_04_cnn_confusion.png',
+        f'CNN Confusion Matrix (Acc={acc:.2%})',
+        viz.COLOR_CNN
+    )
+def main():
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+    # Load full MNIST (flattened for SVD)
+    X_train, y_train = utils.load_data_split(dataset_name="mnist", train=True, flatten=True)
+    X_test, y_test = utils.load_data_split(dataset_name="mnist", train=False, flatten=True)
+    run_svd_analysis(X_train.numpy(), y_train.numpy(), X_test.numpy(), y_test.numpy())
+    run_cnn_baseline(device)
+    print("\nExperiment 01 Diagnosis Completed.")
+if __name__ == "__main__":
+    main()
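The deleted `01_phenomenon_discovery.py` also printed how many 3s were classified as 8s and vice versa, while `run_svd_analysis` in this file only plots the confusion matrix. If that number is still wanted, it could be recovered along these lines. This is a sketch only: `pairwise_confusion_rate` is a hypothetical helper, not part of `src/`.

```python
import numpy as np


def pairwise_confusion_rate(y_true, y_pred, true_label, pred_label):
    """Fraction of samples with label `true_label` that were predicted as `pred_label`."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    mask = y_true == true_label
    if not mask.any():
        raise ValueError(f"no samples with true label {true_label}")
    return float(np.mean(y_pred[mask] == pred_label))


if __name__ == "__main__":
    # Toy labels for illustration only; these are not results from the experiment.
    y_true = [3, 3, 3, 3, 8, 8]
    y_pred = [3, 8, 8, 3, 8, 3]
    print(f"3 -> 8: {pairwise_confusion_rate(y_true, y_pred, 3, 8):.2%}")
    print(f"8 -> 3: {pairwise_confusion_rate(y_true, y_pred, 8, 3):.2%}")
```

Calling it with the `y_test` and `y_pred` arrays inside `run_svd_analysis` would give the same row-normalized quantity that the confusion matrix cell shows.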

experiments/01_phenomenon_discovery.py DELETED

@@ -1,226 +0,0 @@
-import numpy as np
-import matplotlib.pyplot as plt
-import seaborn as sns
-from sklearn.decomposition import TruncatedSVD
-from sklearn.metrics import confusion_matrix, accuracy_score
-from sklearn.linear_model import LogisticRegression
-import torchvision
-import torchvision.transforms as transforms
-import torch
-import os
-from matplotlib.colors import LinearSegmentedColormap
-from src import config
-# --- Configuration ---
-GRAY_LIGHT = "#D8DEE9"
-BLUE_DEEP = "#5E81AC"
-ORANGE = "#D08770"
-def load_mnist():
-    """Load and flatten MNIST data."""
-    print("Loading MNIST...")
-    transform = transforms.Compose([transforms.ToTensor()])
-    # Download to ./data/mnist
-    trainset = torchvision.datasets.MNIST(root=config.MNIST_DIR, train=True, download=True, transform=transform)
-    testset = torchvision.datasets.MNIST(root=config.MNIST_DIR, train=False, download=True, transform=transform)
-    # Flatten: (N, 28, 28) -> (N, 784)
-    X_train = trainset.data.numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y_train = trainset.targets.numpy()
-    X_test = testset.data.numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y_test = testset.targets.numpy()
-    return X_train, y_train, X_test, y_test
-def plot_confusion_matrix(y_true, y_pred, labels, filename, title):
-    """Draws and saves a confusion matrix (normalized)."""
-    cm = confusion_matrix(y_true, y_pred, normalize='true') # Normalize by true class (rows)
-    plt.figure(figsize=(10, 8))
-    cmap = LinearSegmentedColormap.from_list("NBodyBlue", [GRAY_LIGHT, BLUE_DEEP])
-    sns.heatmap(cm, annot=True, fmt='.1%', cmap=cmap, xticklabels=labels, yticklabels=labels)
-    plt.title(title)
-    plt.xlabel('Predicted')
-    plt.ylabel('True')
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename))
-    plt.close()
-    print(f"Saved {filename}")
-def plot_eigen_digits(components, filename, title):
-    """Visualizes the top eigen-digits."""
-    plt.figure(figsize=(12, 4))
-    for i in range(min(10, len(components))): # Show top 10
-        plt.subplot(2, 5, i + 1)
-        plt.imshow(components[i].reshape(28, 28), cmap='gray')
-        plt.title(f"Comp {i+1}")
-        plt.axis('off')
-    plt.suptitle(title)
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename))
-    plt.close()
-    print(f"Saved {filename}")
-def analyze_spectrum(X, filename, title):
-    """
-    Computes and plots the singular value spectrum.
-    Returns cumulative variance stats.
-    """
-    print(f"\nRunning Spectral Analysis on shape {X.shape}...")
-    # Center the data
-    X_mean = np.mean(X, axis=0)
-    X_centered = X - X_mean
-    # Compute full SVD (approximation with high k)
-    n_view = 300 # Increased from 50 to capture >90% variance
-    svd = TruncatedSVD(n_components=n_view, random_state=42)
-    svd.fit(X_centered)
-    singular_values = svd.singular_values_
-    explained_variance_ratio = svd.explained_variance_ratio_
-    cumulative_variance = np.cumsum(explained_variance_ratio)
-    # Quantify stats
-    var_k10 = cumulative_variance[9] * 100
-    def get_k(threshold):
-        idx = np.argmax(cumulative_variance >= threshold)
-        if cumulative_variance[idx] < threshold: return f">{n_view}"
-        return idx + 1
-    k_90 = get_k(0.90)
-    k_95 = get_k(0.95)
-    k_99 = get_k(0.99)
-    print(f"Spectral Stats:")
-    print(f"  Variance @ k=10: {var_k10:.2f}%")
-    print(f"  Components for 90% Var: k={k_90}")
-    print(f"  Components for 95% Var: k={k_95}")
-    print(f"  Components for 99% Var: k={k_99}")
-    # Plot Scree
-    fig, ax1 = plt.subplots(figsize=(10, 6))
-    color = BLUE_DEEP
-    ax1.set_xlabel('Principal Component (k)')
-    ax1.set_ylabel('Singular Value (Log Scale)', color=color)
-    ax1.semilogy(range(1, n_view+1), singular_values, marker='o', linestyle='-', color=color, markersize=4, label='Singular Values')
-    ax1.tick_params(axis='y', labelcolor=color)
-    ax1.grid(True, which="both", ls="-", alpha=0.3)
-    ax2 = ax1.twinx()  # instantiate a second axes that shares the same x-axis
-    color = ORANGE
-    ax2.set_ylabel('Cumulative Explained Variance', color=color)  # we already handled the x-label with ax1
-    ax2.plot(range(1, n_view+1), cumulative_variance, color=color, linewidth=2, linestyle='--', label='Cumulative Variance')
-    ax2.tick_params(axis='y', labelcolor=color)
-    ax2.set_ylim(0, 1.0)
-    # Annotate k=10
-    ax2.axvline(x=10, color='gray', linestyle=':', alpha=0.5)
-    ax2.text(10.5, 0.4, f'k=10\n({var_k10:.1f}%)', color='black')
-    plt.title(title)
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename))
-    plt.close()
-    print(f"Saved {filename}")
-def run_experiment_0(X_train, y_train, X_test, y_test):
-    """
-    Experiment 0: Global SVD Analysis (10 Classes)
-    Hypothesis: 3 and 8 shows significant confusion.
-    """
-    print("\n--- Running Experiment 0: Global SVD (10 Classes) ---")
-    # 1. SVD Reduction
-    n_components = 20  # low rank to force dependency on main variance directions
-    print(f"Reducing dimension to {n_components} using SVD...")
-    # Mean-center for consistency with hybrid model's SVD layer
-    mean = np.mean(X_train, axis=0)
-    X_train_centered = X_train - mean
-    X_test_centered = X_test - mean
-    svd = TruncatedSVD(n_components=n_components, random_state=42)
-    X_train_pca = svd.fit_transform(X_train_centered)
-    X_test_pca = svd.transform(X_test_centered)
-    # 2. Classification (Simple Linear or KNN)
-    # Using Logistic Regression to simulate linear classification on SVD features
-    clf = LogisticRegression(max_iter=1000)
-    clf.fit(X_train_pca, y_train)
-    y_pred = clf.predict(X_test_pca)
-    acc = accuracy_score(y_test, y_pred)
-    print(f"Global SVD+LR Accuracy: {acc*100:.2f}%")
-    # 3. Plot Confusion Matrix
-    plot_confusion_matrix(y_test, y_pred, list(range(10)),
-                         'fig_01_global_svd_confusion.png',
-                         f'Global SVD Confusion Matrix (k={n_components}, Acc={acc:.2f})')
-    # Analyze specific confusion between 3 and 8
-    idxs_3 = (y_test == 3)
-    idxs_8 = (y_test == 8)
-    # Confusion 3->8
-    pred_3 = y_pred[idxs_3]
-    confused_3_as_8 = np.sum(pred_3 == 8)
-    print(f"Class 3 samples classified as 8: {confused_3_as_8} / {len(pred_3)} ({confused_3_as_8/len(pred_3)*100:.2f}%)")
-    # Confusion 8->3
-    pred_8 = y_pred[idxs_8]
-    confused_8_as_3 = np.sum(pred_8 == 3)
-    print(f"Class 8 samples classified as 3: {confused_8_as_3} / {len(pred_8)} ({confused_8_as_3/len(pred_8)*100:.2f}%)")
-def run_experiment_1(X_train, y_train, X_test, y_test):
-    """
-    Experiment 1: Focused 3 vs 8 SVD Analysis
-    """
-    print("\n--- Running Experiment 1: Focused SVD (3 vs 8) ---")
-    # 1. Filter Data
-    mask_train = np.logical_or(y_train == 3, y_train == 8)
-    mask_test = np.logical_or(y_test == 3, y_test == 8)
-    X_train_38 = X_train[mask_train]
-    y_train_38 = y_train[mask_train]
-    X_test_38 = X_test[mask_test]
-    y_test_38 = y_test[mask_test]
-    print(f"Train samples (3 vs 8): {len(y_train_38)}")
-    print(f"Test samples (3 vs 8): {len(y_test_38)}")
-    # 2. SVD on Subset
-    n_components = 10
-    # Mean-center for consistency with hybrid model's SVD layer
-    mean_38 = np.mean(X_train_38, axis=0)
-    X_train_38_centered = X_train_38 - mean_38
-    X_test_38_centered = X_test_38 - mean_38
-    svd = TruncatedSVD(n_components=n_components, random_state=42)
-    X_train_pca = svd.fit_transform(X_train_38_centered)
-    X_test_pca = svd.transform(X_test_38_centered)
-    # 3. Classify
-    clf = LogisticRegression()
-    clf.fit(X_train_pca, y_train_38)
-    y_pred = clf.predict(X_test_pca)
-    acc = accuracy_score(y_test_38, y_pred)
-    print(f"Focused SVD(k={n_components}) Accuracy (3 vs 8): {acc*100:.2f}%")
-    # 4. Plots
-    plot_eigen_digits(svd.components_,
-                     'fig_03_eigen_digits.png',
-                     'Top 10 Eigen-digits (Principal Components of 3&8)')
-    # 5. Spectral Analysis (New)
-    analyze_spectrum(X_train_38, 'fig_02_scree_plot.png', 'Singular Value Spectrum (3 vs 8 Subset)')
-def main():
-    X_train, y_train, X_test, y_test = load_mnist()
-    run_experiment_0(X_train, y_train, X_test, y_test)
-    run_experiment_1(X_train, y_train, X_test, y_test)
-    print("\nExperiments 0 & 1 Completed.")
-    print(f"Results saved to {config.RESULTS_DIR}")
-if __name__ == "__main__":
-    main()
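The `get_k` closure in the deleted `analyze_spectrum` relies on `np.argmax` returning 0 when no entry meets the threshold and then re-checks the value. A sketch of an equivalent standalone helper using `np.searchsorted` follows, in case the threshold statistics are reintroduced. It assumes the explained-variance ratios are non-negative, so their cumulative sum is non-decreasing; `components_for_variance` is a hypothetical name.

```python
import numpy as np


def components_for_variance(explained_variance_ratio, threshold):
    """Smallest k whose cumulative explained variance reaches `threshold`, or None."""
    cumulative = np.cumsum(explained_variance_ratio)
    if cumulative[-1] < threshold:
        # The fitted components do not reach the threshold at all.
        return None
    # First index i with cumulative[i] >= threshold, converted to a 1-based count.
    return int(np.searchsorted(cumulative, threshold, side="left")) + 1


if __name__ == "__main__":
    # Toy ratios for illustration only; not the MNIST spectrum.
    ratios = [0.5, 0.3, 0.1, 0.05]
    for t in (0.4, 0.75, 0.85, 0.99):
        print(t, components_for_variance(ratios, t))
```

Returning `None` instead of a formatted string such as `">300"` leaves the presentation decision to the caller.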

experiments/02_mechanistic_analysis.py ADDED

@@ -0,0 +1,102 @@

+"""
+Exp 02: Mechanistic Analysis
+Combines Interpolation, Explainability (Grad-CAM), and Quantifying Manifold Collapse.
+Refactored for modularity.
+"""
+import torch
+import torch.nn as nn
+import numpy as np
+from sklearn.decomposition import TruncatedSVD
+from sklearn.neighbors import KNeighborsClassifier
+from sklearn.metrics import accuracy_score
+from src import config, utils, viz, exp_utils
+def run_interpolation_analysis(device):
+    print("\n--- Running Mechanistic Proof: The Variance vs. Topology Conflict ---")
+    X_test, y_test = utils.load_data_split(dataset_name="mnist", train=False, digits=[3, 8])
+    _, cnn = utils.load_models(dataset_name="mnist")
+    # Fit SVD baseline for reconstruction analysis
+    X_test_flat = X_test.view(X_test.size(0), -1).numpy()
+    svd_pipe = exp_utils.fit_svd_baseline(X_test_flat, y_test.numpy(), n_components=10)
+    svd = svd_pipe.named_steps['svd']
+    mean = svd_pipe.named_steps['scaler'].mean_
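+    # Pick the first sample of each class. This assumes load_data_split(digits=[3, 8])
+    # remaps labels to 0 (digit 3) and 1 (digit 8); with raw labels these masks are empty.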
+    idx_3 = (y_test == 0).nonzero()[0][0]
+    idx_8 = (y_test == 1).nonzero()[0][0]
+    img_3, img_8 = X_test[idx_3], X_test[idx_8]
+    alphas = np.linspace(0, 1, 11)
+    probs_8, rec_errors = [], []
+    for alpha in alphas:
+        img_interp = (1 - alpha) * img_3 + alpha * img_8
+        # CNN probability assigned to digit 8
+        with torch.no_grad():
+            logits = cnn(img_interp.unsqueeze(0).to(device))
+            # A 10-class model exposes digit 8 at index 8; a binary 3-vs-8 model at index 1.
+            class_idx = 8 if logits.shape[1] == 10 else 1
+            p = torch.softmax(logits, dim=1)[0, class_idx].item()
+            probs_8.append(p)
+        # SVD Reconstruction Error
+        flat = img_interp.view(1, -1).numpy()
+        rec = svd.inverse_transform(svd.transform(flat - mean)) + mean
+        rec_errors.append(np.linalg.norm(flat - rec))
+    viz.plot_interpolation_dynamics(alphas, probs_8, rec_errors, 'fig_05_interpolation.png')
+def run_quantifying_manifold_collapse():
+    print("\n--- Running Exp 02: Quantifying Manifold Collapse ---")
+    X_train, y_train = utils.load_data_split(dataset_name="mnist", train=True, digits=[3, 8], flatten=True)
+    X_test, y_test = utils.load_data_split(dataset_name="mnist", train=False, digits=[3, 8], flatten=True)
+    X_train_np, y_train_np = X_train.numpy(), y_train.numpy()
+    X_test_np, y_test_np = X_test.numpy(), y_test.numpy()
+    # 1. k-NN on Raw Pixel Space (784D)
+    knn_raw = KNeighborsClassifier(n_neighbors=5)
+    knn_raw.fit(X_train_np, y_train_np)
+    acc_raw = accuracy_score(y_test_np, knn_raw.predict(X_test_np))
+    # 2. k-NN on SVD-reduced Space (10D)
+    svd = TruncatedSVD(n_components=10, random_state=42)
+    X_train_svd = svd.fit_transform(X_train_np)
+    X_test_svd = svd.transform(X_test_np)
+    knn_svd = KNeighborsClassifier(n_neighbors=5)
+    knn_svd.fit(X_train_svd, y_train_np)
+    acc_svd = accuracy_score(y_test_np, knn_svd.predict(X_test_svd))
+    # 3. Visualization: SVD vs UMAP
+    try:
+        import umap
+        reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2, random_state=42)
+        X_umap = reducer.fit_transform(X_test_np)
+        viz.plot_manifold_comparison(X_test_svd, X_umap, y_test_np, acc_svd, acc_raw, 'fig_08_manifold_collapse.png')
+    except Exception as e:
+        print(f"Warning: Manifold visualization failed: {e}")
+    print("Manifold Collapse Quantification Results:")
+    print(f" - Raw 784D k-NN Accuracy: {acc_raw:.4f}")
+    print(f" - SVD 10D k-NN Accuracy:  {acc_svd:.4f}")
+    print(f" - Accuracy Loss:           {acc_raw - acc_svd:.4f}")
+def main():
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+    run_interpolation_analysis(device)
+    run_quantifying_manifold_collapse()
+    print("\nExperiment 02 Mechanistic Analysis (Refined) Completed.")
+if __name__ == "__main__":
+    main()
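The reconstruction-error curve computed in `run_interpolation_analysis` can be reproduced in isolation with plain NumPy. The sketch below assumes `V_k` holds orthonormal basis vectors as rows (as `TruncatedSVD.components_` does) and that `mean` is the centering vector; `reconstruction_errors_along_path` is a hypothetical name, not a function in `src/exp_utils.py`.

```python
import numpy as np


def reconstruction_errors_along_path(x_a, x_b, mean, V_k, num_steps=11):
    """Residual norm ||x - reconstruct(x)|| at evenly spaced points between x_a and x_b."""
    alphas = np.linspace(0.0, 1.0, num_steps)
    errors = []
    for alpha in alphas:
        x = (1.0 - alpha) * x_a + alpha * x_b
        centered = x - mean
        # Subtract the part of the sample that lies inside the rank-k subspace.
        residual = centered - centered @ V_k.T @ V_k
        errors.append(float(np.linalg.norm(residual)))
    return alphas, np.array(errors)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical orthonormal basis: 4 directions in R^16.
    Q, _ = np.linalg.qr(rng.normal(size=(16, 4)))
    alphas, errors = reconstruction_errors_along_path(
        rng.normal(size=16), rng.normal(size=16), np.zeros(16), Q.T
    )
    for a, e in zip(alphas, errors):
        print(f"alpha={a:.1f}  error={e:.3f}")
```

Under these assumptions the residual at each step is a convex combination of the two endpoint residuals, so the error along the path never exceeds the larger endpoint error.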

experiments/02_mnist_cnn_confusion.py DELETED

@@ -1,68 +0,0 @@
-# Exp 02 – MNIST 10-class CNN confusion matrix
-import os
-import numpy as np
-import matplotlib.pyplot as plt
-import seaborn as sns
-import torch
-from sklearn.metrics import confusion_matrix, accuracy_score
-from torchvision import datasets, transforms
-from matplotlib.colors import LinearSegmentedColormap
-from src.hybrid_model import SimpleCNN
-from src import config
-GRAY_LIGHT = "#D8DEE9"
-BLUE_LIGHT = "#88C0D0"
-def load_mnist_test():
-    transform = transforms.Compose([transforms.ToTensor()])
-    testset = datasets.MNIST(root=config.MNIST_DIR, train=False, download=True, transform=transform)
-    X_test = testset.data.float() / 255.0
-    y_test = testset.targets
-    X_test = X_test.unsqueeze(1)  # (N, 1, 28, 28)
-    return X_test, y_test
-def main():
-    # Load model
-    model = SimpleCNN(num_classes=10)
-    model.load_state_dict(torch.load(config.CNN_MODEL_PATH, map_location="cpu"))
-    model.eval()
-    # Load data
-    X_test, y_test = load_mnist_test()
-    with torch.no_grad():
-        logits = model(X_test)
-        preds = torch.argmax(logits, dim=1)
-    y_true = y_test.numpy()
-    y_pred = preds.numpy()
-    acc = accuracy_score(y_true, y_pred)
-    print(f"MNIST CNN Accuracy: {acc*100:.2f}%")
-    cm = confusion_matrix(y_true, y_pred, normalize="true")
-    plt.figure(figsize=(10, 8))
-    cmap = LinearSegmentedColormap.from_list(
-        "NBodyCNN",
-        [GRAY_LIGHT, BLUE_LIGHT],
-    )
-    sns.heatmap(cm, annot=True, fmt=".1%", cmap=cmap, xticklabels=list(range(10)), yticklabels=list(range(10)))
-    plt.title(f"MNIST CNN Confusion Matrix (Acc={acc:.2%})")
-    plt.xlabel("Predicted")
-    plt.ylabel("True")
-    plt.tight_layout()
-    out_path = os.path.join(config.RESULTS_DIR, "fig_04_mnist_cnn_confusion.png")
-    plt.savefig(out_path, dpi=300)
-    plt.close()
-    print(f"Saved {out_path}")
-if __name__ == "__main__":
-    main()
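Reviewer note: the deleted script relied on `confusion_matrix(..., normalize="true")`, which divides each row by the number of true samples in that class so the diagonal reads as per-class recall. A minimal NumPy sketch of that normalization follows; the helper name and the sample labels are hypothetical and only illustrate the arithmetic.

```python
import numpy as np

def row_normalized_confusion(y_true, y_pred, n_classes):
    """Confusion matrix whose rows (true classes) each sum to 1."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    cm = np.zeros((n_classes, n_classes), dtype=float)
    np.add.at(cm, (y_true, y_pred), 1.0)  # count each (true, predicted) pair
    row_sums = cm.sum(axis=1, keepdims=True)
    # Rows with no samples stay at zero instead of producing NaN.
    return np.divide(cm, row_sums, out=np.zeros_like(cm), where=row_sums > 0)

# Hypothetical labels for three classes.
cm = row_normalized_confusion([0, 0, 1, 1, 1, 2], [0, 1, 1, 1, 0, 2], n_classes=3)
print(cm)
```

Entry `cm[i, j]` is then the fraction of class `i` samples predicted as class `j`, which is what the heatmap annotates with `fmt=".1%"`.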

experiments/03_mechanistic_investigation.py DELETED

@@ -1,244 +0,0 @@
-import torch
-import torch.nn as nn
-import torch.optim as optim
-from torch.utils.data import TensorDataset, DataLoader
-import numpy as np
-import matplotlib.pyplot as plt
-from sklearn.decomposition import TruncatedSVD
-import cv2
-import os
-import ssl
-import torchvision
-from src.hybrid_model import SimpleCNN
-from src import config
-# --- Configuration ---
-BLUE_LIGHT = "#88C0D0"
-BLUE_DEEP = "#5E81AC"
-BATCH_SIZE = 64
-EPOCHS = 5
-def load_mnist_38():
-    """Load MNIST and filter for 3 vs 8."""
-    ssl._create_default_https_context = ssl._create_unverified_context
-    transform = torchvision.transforms.Compose([torchvision.transforms.ToTensor()])
-    trainset = torchvision.datasets.MNIST(root=config.MNIST_DIR, train=True, download=True, transform=transform)
-    testset = torchvision.datasets.MNIST(root=config.MNIST_DIR, train=False, download=True, transform=transform)
-    def filter_38(dataset):
-        mask = (dataset.targets == 3) | (dataset.targets == 8)
-        data = dataset.data[mask].unsqueeze(1).float() / 255.0
-        targets = dataset.targets[mask]
-        targets = torch.where(targets == 3, torch.tensor(0), torch.tensor(1))
-        return data, targets
-    X_train, y_train = filter_38(trainset)
-    X_test, y_test = filter_38(testset)
-    return X_train, y_train, X_test, y_test
-# --- Training Helper ---
-def train_model(X_train, y_train):
-    model = SimpleCNN(num_classes=2)
-    criterion = nn.CrossEntropyLoss()
-    optimizer = optim.Adam(model.parameters(), lr=0.001)
-    dataset = TensorDataset(X_train, y_train)
-    loader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)
-    print("Training CNN for Analysis...")
-    model.train()
-    for epoch in range(EPOCHS):
-        for inputs, labels in loader:
-            optimizer.zero_grad()
-            outputs = model(inputs)
-            loss = criterion(outputs, labels)
-            loss.backward()
-            optimizer.step()
-    return model
-# --- Experiment 3: Interpolation ---
-def run_interpolation_analysis(model, svd, X_test, y_test, svd_mean=None):
-    print("\n--- Running Exp 3: Interpolation Analysis ---")
-    # 1. Pick one pair of digits: the first 3 and the first 8 in the test set
-    # (a simple stand-in for a 'canonical' 3 and 8)
-    idx_3 = (y_test == 0).nonzero(as_tuple=True)[0][0]
-    idx_8 = (y_test == 1).nonzero(as_tuple=True)[0][0]
-    img_3 = X_test[idx_3] # (1, 28, 28)
-    img_8 = X_test[idx_8]
-    # 2. Generate Interpolation steps
-    alphas = np.linspace(0, 1, 11)
-    interpolated_imgs = []
-    cnn_probs_8 = []
-    svd_errors = []
-    print("Computing metrics along interpolation path...")
-    for alpha in alphas:
-        # Linear Interpolation
-        img_interp = (1 - alpha) * img_3 + alpha * img_8
-        interpolated_imgs.append(img_interp.squeeze().numpy())
-        # CNN Prediction
-        with torch.no_grad():
-            img_tensor = img_interp.unsqueeze(0) # (1, 1, 28, 28)
-            logits = model(img_tensor)
-            probs = torch.softmax(logits, dim=1)
-            cnn_probs_8.append(probs[0, 1].item())
-        # SVD Reconstruction (using the passed svd model)
-        # SVD expects (N, 784), mean-centered to match training
-        img_flat = img_interp.view(1, -1).numpy()
-        img_centered = img_flat - svd_mean if svd_mean is not None else img_flat
-        img_pca = svd.transform(img_centered)
-        img_rec = svd.inverse_transform(img_pca)
-        if svd_mean is not None:
-            img_rec = img_rec + svd_mean
-        # Reconstruction Error (L2)
-        rec_err = np.linalg.norm(img_flat - img_rec)
-        svd_errors.append(rec_err)
-    # 3. Plotting
-    plt.figure(figsize=(12, 6))
-    # Plot Images
-    for i, img in enumerate(interpolated_imgs):
-        plt.subplot(3, 11, i + 1)
-        plt.imshow(img, cmap='gray')
-        plt.axis('off')
-        if i == 0: plt.title("Start (3)")
-        if i == 10: plt.title("End (8)")
-    # Plot Curves
-    plt.subplot(3, 1, 2)
-    plt.plot(alphas, cnn_probs_8, marker='o', color=BLUE_LIGHT, label='CNN Prob(Class=8)')
-    plt.ylabel('CNN Probability')
-    plt.grid(True)
-    plt.legend()
-    plt.subplot(3, 1, 3)
-    plt.plot(alphas, svd_errors, marker='s', color=BLUE_DEEP, label='SVD Rec. Error')
-    plt.xlabel('Interpolation Alpha (0=3, 1=8)')
-    plt.ylabel('SVD Error')
-    plt.grid(True)
-    plt.legend()
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, 'fig_05_interpolation_analysis.png'))
-    plt.close()
-    print("Saved fig_05_interpolation_analysis.png")
-    return interpolated_imgs # Return for Exp 4 use
-# --- Experiment 4: Explainability (Grad-CAM vs SVD) ---
-def run_explainability_analysis(model, svd, interpolated_imgs, svd_mean=None):
-    print("\n--- Running Exp 4: Explainability Analysis ---")
-    # Pick the middle ambiguous image (alpha=0.5)
-    middle_idx = 5
-    img_ambiguous = torch.tensor(interpolated_imgs[middle_idx]).unsqueeze(0).unsqueeze(0) # (1, 1, 28, 28)
-    # 1. CNN Grad-CAM
-    # Hook into last conv layer
-    gradients = []
-    activations = []
-    def backward_hook(module, grad_input, grad_output):
-        gradients.append(grad_output[0])
-    def forward_hook(module, input, output):
-        activations.append(output)
-    # Register hooks on conv2
-    handle_b = model.conv2.register_full_backward_hook(backward_hook)
-    handle_f = model.conv2.register_forward_hook(forward_hook)
-    # Forward & Backward
-    model.eval()
-    logits = model(img_ambiguous)
-    # Target class 8 (index 1) for visualization
-    logits[0, 1].backward()
-    # Generate Heatmap
-    grads = gradients[0].cpu().data.numpy()[0] # (32, 7, 7)
-    fmaps = activations[0].cpu().data.numpy()[0] # (32, 7, 7)
-    weights = np.mean(grads, axis=(1, 2)) # Global Average Pooling
-    cam = np.zeros(fmaps.shape[1:], dtype=np.float32)
-    for i, w in enumerate(weights):
-        cam += w * fmaps[i]
-    cam = np.maximum(cam, 0)
-    cam = cv2.resize(cam, (28, 28))
-    cam = cam - np.min(cam)
-    cam = cam / (np.max(cam) + 1e-8)  # guard against an all-zero map
-    # 2. SVD Reconstruction
-    img_flat = img_ambiguous.view(1, -1).numpy()
-    img_centered = img_flat - svd_mean if svd_mean is not None else img_flat
-    img_pca = svd.transform(img_centered)
-    img_rec = svd.inverse_transform(img_pca)
-    if svd_mean is not None:
-        img_rec = img_rec + svd_mean
-    img_rec = img_rec.reshape(28, 28)
-    # 3. Plot Comparison
-    plt.figure(figsize=(10, 4))
-    # Original Ambiguous
-    plt.subplot(1, 3, 1)
-    plt.imshow(img_ambiguous.squeeze(), cmap='gray')
-    plt.title("Ambiguous Input (Alpha=0.5)")
-    plt.axis('off')
-    # CNN Attention
-    plt.subplot(1, 3, 2)
-    plt.imshow(img_ambiguous.squeeze(), cmap='gray')
-    plt.imshow(cam, cmap='jet', alpha=0.5) # Overlay
-    plt.title("CNN Attention (Grad-CAM)")
-    plt.axis('off')
-    # SVD Reconstruction
-    plt.subplot(1, 3, 3)
-    plt.imshow(img_rec, cmap='gray')
-    plt.title("SVD Reconstruction")
-    plt.axis('off')
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, 'fig_06_explainability.png'))
-    plt.close()
-    print("Saved fig_06_explainability.png")
-    handle_b.remove()
-    handle_f.remove()
-def main():
-    # Load Data
-    X_train_tensor, y_train_tensor, X_test_tensor, y_test_tensor = load_mnist_38()
-    # Train CNN
-    cnn_model = train_model(X_train_tensor, y_train_tensor)
-    # Fit SVD (on train data 3 vs 8)
-    print("Fitting SVD on 3 vs 8...")
-    X_train_np = X_train_tensor.view(-1, 784).numpy()
-    # Mean-center for consistency with hybrid model's SVD layer
-    svd_mean = np.mean(X_train_np, axis=0)
-    X_train_centered = X_train_np - svd_mean
-    svd = TruncatedSVD(n_components=10, random_state=42)
-    svd.fit(X_train_centered)
-    # Run Experiments
-    interp_imgs = run_interpolation_analysis(cnn_model, svd, X_test_tensor, y_test_tensor, svd_mean)
-    run_explainability_analysis(cnn_model, svd, interp_imgs, svd_mean)
-    print("\nDeep Dive Analysis Completed.")
-if __name__ == "__main__":
-    main()
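Reviewer note: the Grad-CAM step in `run_explainability_analysis` reduces to three array operations once the hooks have captured the activations and gradients: average the gradients per channel, take the weighted sum of feature maps, then apply ReLU and rescale. The sketch below restates that in NumPy. The function name and the tiny 2x2 inputs are hypothetical, and the `eps` guard is an added assumption to avoid dividing by zero when the map is empty.

```python
import numpy as np

def grad_cam_map(fmaps, grads, eps=1e-8):
    """Grad-CAM heatmap from feature maps and gradients, both shaped (C, H, W)."""
    weights = grads.mean(axis=(1, 2))           # one weight per channel
    cam = np.tensordot(weights, fmaps, axes=1)  # weighted sum over channels -> (H, W)
    cam = np.maximum(cam, 0.0)                  # keep positive evidence only
    cam -= cam.min()
    return cam / (cam.max() + eps)              # scale to [0, 1)

# Hypothetical inputs: channel 0 supports the class, channel 1 opposes it.
fmaps = np.array([[[1.0, 0.0], [0.0, 0.0]],
                  [[0.0, 0.0], [0.0, 2.0]]])
grads = np.stack([np.ones((2, 2)), -np.ones((2, 2))])
cam = grad_cam_map(fmaps, grads)
print(cam)
```

In the script the resulting map is resized to 28x28 with `cv2.resize` before being overlaid on the input.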

experiments/04_robustness_limit.py DELETED

@@ -1,187 +0,0 @@
-import torch
-import torch.nn as nn
-import torch.optim as optim
-from torch.utils.data import TensorDataset, DataLoader
-import numpy as np
-import matplotlib.pyplot as plt
-from sklearn.decomposition import TruncatedSVD
-from sklearn.linear_model import LogisticRegression
-import torchvision
-import torchvision.transforms as transforms
-import os
-from src import config
-# --- Configuration ---
-BLUE_LIGHT = "#88C0D0"
-BLUE_DEEP = "#5E81AC"
-BATCH_SIZE = 64
-# --- Model (Same as before) ---
-from src.hybrid_model import SimpleCNN
-# --- Comparison Evaluation ---
-def load_data():
-    transform = transforms.Compose([transforms.ToTensor()])
-    trainset = torchvision.datasets.MNIST(root=config.MNIST_DIR, train=True, download=True, transform=transform)
-    testset = torchvision.datasets.MNIST(root=config.MNIST_DIR, train=False, download=True, transform=transform)
-    def filter_38(dataset):
-        mask = (dataset.targets == 3) | (dataset.targets == 8)
-        data = dataset.data[mask].unsqueeze(1).float() / 255.0
-        targets = dataset.targets[mask]
-        targets = torch.where(targets == 3, torch.tensor(0), torch.tensor(1))
-        return data, targets
-    X_train, y_train = filter_38(trainset)
-    X_test, y_test = filter_38(testset)
-    return X_train, y_train, X_test, y_test
-def add_noise(images, noise_level):
-    """Add Gaussian noise."""
-    noise = torch.randn_like(images) * noise_level
-    noisy_imgs = images + noise
-    return torch.clamp(noisy_imgs, 0, 1)
-def add_blur(images, kernel_size):
-    """Add Gaussian blur."""
-    if kernel_size <= 1: return images
-    blur_fn = transforms.GaussianBlur(kernel_size=(kernel_size, kernel_size), sigma=(0.1 + 0.3 * (kernel_size//2)))
-    return blur_fn(images)
-def evaluate_models(cnn, svd_clf, svd_transform, X_test, y_test, svd_mean=None):
-    # CNN Eval
-    cnn.eval()
-    with torch.no_grad():
-        logits = cnn(X_test)
-        preds = torch.argmax(logits, dim=1)
-        cnn_acc = (preds == y_test).float().mean().item()
-    # SVD Eval
-    X_flat = X_test.view(X_test.size(0), -1).numpy()
-    if svd_mean is not None:
-        X_flat = X_flat - svd_mean
-    X_pca = svd_transform.transform(X_flat)
-    y_pred_svd = svd_clf.predict(X_pca)
-    svd_acc = np.mean(y_pred_svd == y_test.numpy())
-    return cnn_acc, svd_acc
-def main():
-    print("Loading Data...")
-    X_train, y_train, X_test, y_test = load_data()
-    # 1. Train Models (Quickly)
-    print("Training CNN Baseline...")
-    cnn = SimpleCNN(num_classes=2)
-    optimizer = optim.Adam(cnn.parameters(), lr=0.001)
-    criterion = nn.CrossEntropyLoss()
-    dataset = TensorDataset(X_train, y_train)
-    loader = DataLoader(dataset, batch_size=64, shuffle=True)
-    cnn.train()
-    for _ in range(3): # 3 Epochs enough for 99%
-        for x, y in loader:
-            optimizer.zero_grad()
-            loss = criterion(cnn(x), y)
-            loss.backward()
-            optimizer.step()
-    print("Training SVD Baseline...")
-    X_train_flat = X_train.view(X_train.size(0), -1).numpy()
-    # Mean-center for consistency with hybrid model's SVD layer
-    svd_mean = np.mean(X_train_flat, axis=0)
-    X_train_centered = X_train_flat - svd_mean
-    svd = TruncatedSVD(n_components=10, random_state=42)
-    X_train_pca = svd.fit_transform(X_train_centered)
-    clf = LogisticRegression(max_iter=500)
-    clf.fit(X_train_pca, y_train.numpy())
-    # 2. Noise Experiment
-    noise_levels = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
-    cnn_accs_noise = []
-    svd_accs_noise = []
-    print("Running Noise Experiments...")
-    for nl in noise_levels:
-        X_test_noisy = add_noise(X_test, nl)
-        ca, sa = evaluate_models(cnn, clf, svd, X_test_noisy, y_test, svd_mean)
-        cnn_accs_noise.append(ca)
-        svd_accs_noise.append(sa)
-        print(f"Noise {nl}: CNN={ca:.2f}, SVD={sa:.2f}")
-    # 3. Blur Experiment
-    blur_kernels = [1, 3, 5, 7, 9, 11]
-    cnn_accs_blur = []
-    svd_accs_blur = []
-    print("Running Blur Experiments...")
-    for k in blur_kernels:
-        X_test_blur = add_blur(X_test, k)
-        ca, sa = evaluate_models(cnn, clf, svd, X_test_blur, y_test, svd_mean)
-        cnn_accs_blur.append(ca)
-        svd_accs_blur.append(sa)
-        print(f"Blur K={k}: CNN={ca:.2f}, SVD={sa:.2f}")
-    # 4. Plots
-    plt.figure(figsize=(12, 5))
-    # Noise Plot
-    plt.subplot(1, 2, 1)
-    plt.plot(noise_levels, cnn_accs_noise, marker='o', color=BLUE_LIGHT, label='CNN')
-    plt.plot(noise_levels, svd_accs_noise, marker='s', color=BLUE_DEEP, label='SVD')
-    plt.xlabel(r'Gaussian Noise ($\sigma$)')
-    plt.ylabel('Accuracy')
-    plt.title('Robustness vs Noise')
-    plt.legend()
-    plt.grid(True)
-    # Blur Plot
-    plt.subplot(1, 2, 2)
-    plt.plot(blur_kernels, cnn_accs_blur, marker='o', color=BLUE_LIGHT, label='CNN')
-    plt.plot(blur_kernels, svd_accs_blur, marker='s', color=BLUE_DEEP, label='SVD')
-    plt.xlabel('Blur Kernel Size')
-    plt.ylabel('Accuracy')
-    plt.title('Robustness vs Blur')
-    plt.legend()
-    plt.grid(True)
-    plt.savefig(os.path.join(config.RESULTS_DIR, 'fig_07_degradation_curves.png'))
-    plt.close()
-    # 5. Visualizing Breakdown
-    # Find a sample that the CNN misclassifies at high noise (sigma=0.6)
-    print("Generating Breakdown Visuals...")
-    noise_high = 0.6
-    X_test_high_noise = add_noise(X_test, noise_high)
-    cnn.eval()
-    logits = cnn(X_test_high_noise)
-    preds = torch.argmax(logits, dim=1)
-    # Find failures
-    failures = (preds != y_test).nonzero(as_tuple=True)[0]
-    if len(failures) > 0:
-        idx = failures[0]
-        img_clean = X_test[idx].squeeze()
-        img_noisy = X_test_high_noise[idx].squeeze()
-        plt.figure(figsize=(8, 4))
-        plt.subplot(1, 2, 1)
-        plt.imshow(img_clean, cmap='gray')
-        plt.title(f"Clean (True: {y_test[idx]})")
-        plt.subplot(1, 2, 2)
-        plt.imshow(img_noisy, cmap='gray')
-        plt.title(f"Noisy $\\sigma$={noise_high}\nCNN Pred: {preds[idx]}")
-        plt.savefig(os.path.join(config.RESULTS_DIR, 'fig_08_breakdown_point.png'))
-        plt.close()
-    print("Experiment 5 Completed.")
-if __name__ == "__main__":
-    main()
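Reviewer note: `add_noise` draws from the global torch RNG, so the degradation curves change from run to run unless a seed is set elsewhere. A hedged NumPy sketch of the same perturbation with an explicit generator is shown below; `add_noise_np` is a hypothetical name, not part of the repository.

```python
import numpy as np

def add_noise_np(images, sigma, rng):
    """Add zero-mean Gaussian noise with std `sigma`, then clip to [0, 1]."""
    noise = rng.normal(loc=0.0, scale=sigma, size=images.shape)
    return np.clip(images + noise, 0.0, 1.0)

# Hypothetical batch of mid-grey 28x28 images.
images = np.full((4, 28, 28), 0.5)
noisy_a = add_noise_np(images, 0.3, np.random.default_rng(0))
noisy_b = add_noise_np(images, 0.3, np.random.default_rng(0))
clean = add_noise_np(images, 0.0, np.random.default_rng(0))
print(np.array_equal(noisy_a, noisy_b), float(noisy_a.min()), float(noisy_a.max()))
```

Passing the generator in makes each noise level reproducible on its own, which helps when comparing models on the same corrupted test set.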

experiments/05_manifold_learning.py DELETED

@@ -1,103 +0,0 @@
-import numpy as np
-import matplotlib.pyplot as plt
-import seaborn as sns
-from sklearn.decomposition import TruncatedSVD
-from sklearn.metrics import silhouette_score
-import umap
-import torch
-import torchvision
-from torchvision import transforms
-import os
-from matplotlib.colors import ListedColormap
-from src import config
-BLUE_DEEP = "#5E81AC"
-ORANGE = "#D08770"
-# Configure styling
-sns.set_style("whitegrid")
-plt.rcParams.update({'font.size': 12})
-def load_data():
-    """Load and filter MNIST for digits 3 and 8."""
-    print("Loading MNIST data...")
-    # Fix for Mac SSL certificate issue
-    import ssl
-    ssl._create_default_https_context = ssl._create_unverified_context
-    transform = transforms.Compose([transforms.ToTensor()])
-    # Try loading without download first
-    try:
-        dataset = torchvision.datasets.MNIST(root=config.MNIST_DIR, train=True, download=False, transform=transform)
-    except RuntimeError:  # dataset not found locally
-        dataset = torchvision.datasets.MNIST(root=config.MNIST_DIR, train=True, download=True, transform=transform)
-    # Filter for 3 and 8
-    idx = (dataset.targets == 3) | (dataset.targets == 8)
-    dataset.targets = dataset.targets[idx]
-    dataset.data = dataset.data[idx]
-    # Flatten images (N, 784)
-    X = dataset.data.numpy().reshape(-1, 28*28).astype(np.float32) / 255.0
-    y = dataset.targets.numpy()
-    print(f"Dataset loaded: {X.shape} samples (Classes: {np.unique(y)})")
-    return X, y
-def run_experiment():
-    X, y = load_data()
-    # Subsample for UMAP speed if necessary (though MNIST 3/8 is small enough ~12k samples)
-    # We'll use full set for accurate density
-    print("\n--- Running SVD Projection (Linear) ---")
-    # Mean-center for consistency with project convention
-    X_centered = X - X.mean(axis=0)
-    svd = TruncatedSVD(n_components=2, random_state=42)
-    X_svd = svd.fit_transform(X_centered)
-    sil_svd = silhouette_score(X_svd, y)
-    print(f"SVD Silhouette Score: {sil_svd:.4f}")
-    print("\n--- Running UMAP Projection (Non-linear) ---")
-    reducer = umap.UMAP(n_components=2, random_state=42, n_neighbors=15, min_dist=0.1)
-    X_umap = reducer.fit_transform(X)
-    sil_umap = silhouette_score(X_umap, y)
-    print(f"UMAP Silhouette Score: {sil_umap:.4f}")
-    # Plotting
-    print("\nGenerating Comparison Plot...")
-    fig, axes = plt.subplots(1, 2, figsize=(16, 7))
-    # Custom cmap for 3 and 8
-    # 3 is lower index, 8 is higher. We map 3 -> blue_deep, 8 -> orange
-    cmap = ListedColormap([BLUE_DEEP, ORANGE])
-    # Plot SVD
-    scatter_svd = axes[0].scatter(X_svd[:, 0], X_svd[:, 1], c=y, cmap=cmap, alpha=0.5, s=2)
-    axes[0].set_title(f"Linear Projection (SVD)\nSilhouette Score: {sil_svd:.3f}", fontsize=14)
-    axes[0].set_xlabel("PC 1")
-    axes[0].set_ylabel("PC 2")
-    # Plot UMAP
-    scatter_umap = axes[1].scatter(X_umap[:, 0], X_umap[:, 1], c=y, cmap=cmap, alpha=0.5, s=2)
-    axes[1].set_title(f"Manifold Learning (UMAP)\nSilhouette Score: {sil_umap:.3f}", fontsize=14)
-    axes[1].set_xlabel("UMAP 1")
-    axes[1].set_ylabel("UMAP 2")
-    # Add legend
-    legend1 = axes[0].legend(*scatter_svd.legend_elements(), title="Digits")
-    axes[0].add_artist(legend1)
-    legend2 = axes[1].legend(*scatter_umap.legend_elements(), title="Digits")
-    axes[1].add_artist(legend2)
-    plt.tight_layout()
-    save_path = os.path.join(config.RESULTS_DIR, "fig_09_manifold_comparison.png")
-    plt.savefig(save_path, dpi=150)
-    print(f"Plot saved to: {save_path}")
-    return sil_svd, sil_umap
-if __name__ == "__main__":
-    run_experiment()
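Reviewer note: the SVD and UMAP embeddings above are compared through the silhouette score. For a point with mean intra-cluster distance $a$ and mean distance $b$ to the nearest other cluster, the score is $(b - a) / \max(a, b)$. The brute-force sketch below spells that out in NumPy. It is an illustration, not a replacement for `sklearn.metrics.silhouette_score`; the function name and the four sample points are hypothetical, and it assumes at least two clusters.

```python
import numpy as np

def silhouette_mean(X, labels):
    """Mean silhouette coefficient with Euclidean distances (brute force)."""
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    clusters = np.unique(labels)
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        same[i] = False                      # exclude the point itself
        if not same.any():                   # singleton cluster scores 0
            scores.append(0.0)
            continue
        a = d[i, same].mean()                # cohesion within own cluster
        b = min(d[i, labels == c].mean()     # separation from nearest other cluster
                for c in clusters if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Hypothetical 1-D data: two tight, well-separated clusters.
X = np.array([[0.0], [1.0], [10.0], [11.0]])
labels = np.array([0, 0, 1, 1])
score = silhouette_mean(X, labels)
print(f"{score:.4f}")
```

Scores near 1 indicate compact, well-separated classes, while scores near 0 indicate overlapping ones.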

experiments/06_fashion_mnist_baseline.py DELETED

@@ -1,115 +0,0 @@
-# Exp 06 – Fashion-MNIST SVD baseline (replicating MNIST findings)
-import numpy as np
-import matplotlib.pyplot as plt
-import seaborn as sns
-from sklearn.decomposition import TruncatedSVD
-from sklearn.linear_model import LogisticRegression
-from sklearn.metrics import confusion_matrix, accuracy_score
-from matplotlib.colors import LinearSegmentedColormap
-import torchvision
-import torchvision.transforms as transforms
-import os
-from src import config
-# --- Configuration ---
-GRAY_LIGHT = "#D8DEE9"
-BLUE_DEEP = "#5E81AC"
-# Fashion-MNIST class names
-CLASS_NAMES = ['T-shirt', 'Trouser', 'Pullover', 'Dress', 'Coat',
-               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
-def load_fashion_mnist():
-    """Load and flatten Fashion-MNIST data."""
-    print("Loading Fashion-MNIST...")
-    transform = transforms.Compose([transforms.ToTensor()])
-    trainset = torchvision.datasets.FashionMNIST(root=config.FASHION_MNIST_DIR, train=True, download=True, transform=transform)
-    testset = torchvision.datasets.FashionMNIST(root=config.FASHION_MNIST_DIR, train=False, download=True, transform=transform)
-    X_train = trainset.data.numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y_train = trainset.targets.numpy()
-    X_test = testset.data.numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y_test = testset.targets.numpy()
-    return X_train, y_train, X_test, y_test
-def plot_confusion_matrix(y_true, y_pred, labels, filename, title):
-    """Draws and saves a confusion matrix (normalized by row = recall)."""
-    cm = confusion_matrix(y_true, y_pred, normalize='true')
-    plt.figure(figsize=(12, 10))
-    cmap = LinearSegmentedColormap.from_list("NBodyBlue", [GRAY_LIGHT, BLUE_DEEP])
-    sns.heatmap(cm, annot=True, fmt='.1%', cmap=cmap,
-                xticklabels=labels, yticklabels=labels)
-    plt.title(title)
-    plt.xlabel('Predicted')
-    plt.ylabel('True')
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename), dpi=300)
-    plt.close()
-    print(f"Saved {filename}")
-def analyze_confusion_pairs(cm, class_names, top_k=5):
-    """Identify the most confused class pairs."""
-    n = len(class_names)
-    confusions = []
-    for i in range(n):
-        for j in range(n):
-            if i != j:
-                confusions.append((cm[i, j], class_names[i], class_names[j]))
-    confusions.sort(reverse=True)
-    print(f"\nTop {top_k} Confused Pairs:")
-    for rate, true_class, pred_class in confusions[:top_k]:
-        print(f"  {true_class} → {pred_class}: {rate*100:.2f}%")
-    return confusions[:top_k]
-def run_svd_baseline(X_train, y_train, X_test, y_test):
-    """Run SVD + Logistic Regression baseline."""
-    print("\n--- Running SVD Baseline (Fashion-MNIST) ---")
-    n_components = 20
-    print(f"Reducing dimension to {n_components} using SVD...")
-    # Mean-center for consistency with hybrid model's SVD layer
-    mean = np.mean(X_train, axis=0)
-    X_train_centered = X_train - mean
-    X_test_centered = X_test - mean
-    svd = TruncatedSVD(n_components=n_components, random_state=42)
-    X_train_svd = svd.fit_transform(X_train_centered)
-    X_test_svd = svd.transform(X_test_centered)
-    clf = LogisticRegression(max_iter=1000)
-    clf.fit(X_train_svd, y_train)
-    y_pred = clf.predict(X_test_svd)
-    acc = accuracy_score(y_test, y_pred)
-    print(f"SVD+LR Accuracy: {acc*100:.2f}%")
-    # Confusion Matrix
-    cm = confusion_matrix(y_test, y_pred, normalize='true')
-    plot_confusion_matrix(y_test, y_pred, CLASS_NAMES,
-                         'fig_11_fashion_svd_confusion.png',
-                         f'Fashion-MNIST SVD Confusion (k={n_components}, Acc={acc:.2%})')
-    # Analyze confusions
-    analyze_confusion_pairs(cm, CLASS_NAMES)
-    return svd, clf
-def main():
-    X_train, y_train, X_test, y_test = load_fashion_mnist()
-    svd, clf = run_svd_baseline(X_train, y_train, X_test, y_test)
-    print("\nExperiment 06 Complete.")
-    print(f"Results saved to {config.RESULTS_DIR}")
-if __name__ == "__main__":
-    main()
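Reviewer note: `analyze_confusion_pairs` builds a Python list of every off-diagonal entry and sorts it. An equivalent vectorized sketch is given below for comparison. The function name, class names and matrix values are hypothetical, and it returns tuples in `(true, predicted, rate)` order rather than the script's `(rate, true, predicted)`.

```python
import numpy as np

def top_confused_pairs(cm, class_names, top_k=3):
    """Return the top_k largest off-diagonal entries of a confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    off = cm.copy()
    np.fill_diagonal(off, -np.inf)                   # ignore correct predictions
    flat = np.argsort(off, axis=None)[::-1][:top_k]  # largest entries first
    rows, cols = np.unravel_index(flat, off.shape)
    return [(class_names[i], class_names[j], float(cm[i, j]))
            for i, j in zip(rows, cols)]

# Hypothetical row-normalized confusion matrix for three classes.
cm = [[0.80, 0.15, 0.05],
      [0.10, 0.90, 0.00],
      [0.30, 0.00, 0.70]]
pairs = top_confused_pairs(cm, ["A", "B", "C"], top_k=2)
print(pairs)
```

Because the matrix is row-normalized, each rate is the share of the true class that was sent to the predicted class, so `A -> B` and `B -> A` are ranked separately.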

experiments/07_fashion_cnn_verification.py DELETED

@@ -1,145 +0,0 @@
-# Exp 07 – Fashion-MNIST CNN vs SVD confusion comparison
-import numpy as np
-import matplotlib.pyplot as plt
-import seaborn as sns
-import torch
-import torch.nn as nn
-import torch.optim as optim
-from torch.utils.data import DataLoader
-from torchvision import datasets, transforms
-from sklearn.metrics import confusion_matrix, accuracy_score
-from matplotlib.colors import LinearSegmentedColormap
-import os
-from src import config
-from src.hybrid_model import SimpleCNN
-# --- Configuration ---
-GRAY_LIGHT = "#D8DEE9"
-BLUE_DEEP = "#5E81AC"
-CLASS_NAMES = ['T-shirt', 'Trouser', 'Pullover', 'Dress', 'Coat',
-               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
-def train_cnn(train_loader, epochs=10):
-    """Train CNN on Fashion-MNIST."""
-    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-    model = SimpleCNN(num_classes=10).to(device)
-    criterion = nn.CrossEntropyLoss()
-    optimizer = optim.Adam(model.parameters(), lr=0.001)
-    model.train()
-    for epoch in range(epochs):
-        running_loss = 0.0
-        for inputs, labels in train_loader:
-            inputs, labels = inputs.to(device), labels.to(device)
-            optimizer.zero_grad()
-            outputs = model(inputs)
-            loss = criterion(outputs, labels)
-            loss.backward()
-            optimizer.step()
-            running_loss += loss.item()
-        print(f"Epoch {epoch+1}/{epochs}, Loss: {running_loss/len(train_loader):.4f}")
-    return model
-def evaluate_cnn(model, test_loader):
-    """Evaluate CNN and return predictions."""
-    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-    model.eval()
-    all_preds = []
-    all_targets = []
-    with torch.no_grad():
-        for inputs, labels in test_loader:
-            inputs = inputs.to(device)
-            outputs = model(inputs)
-            preds = outputs.argmax(dim=1)
-            all_preds.extend(preds.cpu().numpy())
-            all_targets.extend(labels.numpy())
-    return np.array(all_preds), np.array(all_targets)
-def plot_confusion_matrix(y_true, y_pred, labels, filename, title):
-    """Draws and saves a confusion matrix (normalized by row = recall)."""
-    cm = confusion_matrix(y_true, y_pred, normalize='true')
-    plt.figure(figsize=(12, 10))
-    cmap = LinearSegmentedColormap.from_list("NBodyBlue", [GRAY_LIGHT, BLUE_DEEP])
-    sns.heatmap(cm, annot=True, fmt='.1%', cmap=cmap,
-                xticklabels=labels, yticklabels=labels)
-    plt.title(title)
-    plt.xlabel('Predicted')
-    plt.ylabel('True')
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename), dpi=300)
-    plt.close()
-    print(f"Saved {filename}")
-def analyze_confusion_improvement(svd_confusions, cnn_cm, class_names):
-    """Compare SVD vs CNN on the most confused pairs."""
-    print("\n--- Confusion Comparison: SVD vs CNN ---")
-    print(f"{'Pair':<25} {'SVD Error':<12} {'CNN Error':<12} {'Improvement':<12}")
-    print("-" * 60)
-    for svd_rate, true_class, pred_class in svd_confusions:
-        i = class_names.index(true_class)
-        j = class_names.index(pred_class)
-        cnn_rate = cnn_cm[i, j]
-        improvement = (svd_rate - cnn_rate) / svd_rate * 100 if svd_rate > 0 else 0
-        print(f"{true_class} → {pred_class:<10} {svd_rate*100:>8.2f}%    {cnn_rate*100:>8.2f}%    {improvement:>8.1f}%")
-def main():
-    print("Loading Fashion-MNIST...")
-    transform = transforms.Compose([transforms.ToTensor()])
-    trainset = datasets.FashionMNIST(root=config.FASHION_MNIST_DIR, train=True, download=True, transform=transform)
-    testset = datasets.FashionMNIST(root=config.FASHION_MNIST_DIR, train=False, download=True, transform=transform)
-    train_loader = DataLoader(trainset, batch_size=64, shuffle=True)
-    test_loader = DataLoader(testset, batch_size=1000, shuffle=False)
-    print("\n--- Training CNN on Fashion-MNIST ---")
-    model = train_cnn(train_loader, epochs=10)
-    print("\n--- Evaluating CNN ---")
-    y_pred, y_true = evaluate_cnn(model, test_loader)
-    acc = accuracy_score(y_true, y_pred)
-    print(f"CNN Accuracy: {acc*100:.2f}%")
-    # Confusion Matrix
-    cm = confusion_matrix(y_true, y_pred, normalize='true')
-    plot_confusion_matrix(y_true, y_pred, CLASS_NAMES,
-                         'fig_12_fashion_cnn_confusion.png',
-                         f'Fashion-MNIST CNN Confusion (Acc={acc:.2%})')
-    # Save model for later use
-    model_path = os.path.join("models", "cnn_fashion.pth")
-    os.makedirs("models", exist_ok=True)
-    torch.save(model.state_dict(), model_path)
-    print(f"Model saved to {model_path}")
-    # Compare with SVD (hardcoded top confusions from experiment 06)
-    # These will be updated after running experiment 06
-    svd_confusions = [
-        (0.15, 'Shirt', 'T-shirt'),
-        (0.12, 'Shirt', 'Coat'),
-        (0.10, 'Pullover', 'Coat'),
-        (0.08, 'T-shirt', 'Shirt'),
-        (0.06, 'Coat', 'Pullover'),
-    ]
-    analyze_confusion_improvement(svd_confusions, cm, CLASS_NAMES)
-    print("\nExperiment 07 Complete.")
-if __name__ == "__main__":
-    main()
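Reviewer note: the "Improvement" column printed by `analyze_confusion_improvement` is the relative reduction in confusion rate, $(r_{\text{SVD}} - r_{\text{CNN}}) / r_{\text{SVD}} \times 100$. A small standalone sketch of that arithmetic follows; the function name and the rates passed to it are hypothetical and are not results from either experiment.

```python
def relative_improvement(svd_rate, cnn_rate):
    """Percentage reduction of a confusion rate when moving from SVD to CNN."""
    if svd_rate <= 0:
        return 0.0  # nothing to improve on; mirrors the guard in the script
    return (svd_rate - cnn_rate) / svd_rate * 100.0

# Hypothetical rates: confusion falls from 15% to 3%.
gain = relative_improvement(0.15, 0.03)
# Hypothetical rates: confusion rises from 10% to 20% (negative improvement).
loss = relative_improvement(0.10, 0.20)
print(f"{gain:.1f}% {loss:.1f}%")
```

A negative value means the CNN confuses that pair more often than the SVD baseline did.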

experiments/08_hybrid_robustness.py DELETED

@@ -1,253 +0,0 @@
-# Exp 08 – MNIST 10-class hybrid robustness under Gaussian noise (multi-seed)
-import numpy as np
-import matplotlib.pyplot as plt
-import torch
-from torchvision import datasets, transforms
-from sklearn.decomposition import TruncatedSVD
-from sklearn.linear_model import LogisticRegression
-from sklearn.metrics import accuracy_score
-import os
-import json
-from scipy.ndimage import gaussian_filter
-from src.hybrid_model import SimpleCNN, HybridSVDCNN, create_svd_layer
-from src import config
-# --- Configuration ---
-BLUE_LIGHT = "#88C0D0"
-BLUE_DEEP = "#5E81AC"
-ORANGE = "#D08770"
-RED = "#BF616A"
-SVD_K = 20
-NOISE_LEVELS = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
-SEEDS = [42, 123, 456]
-BLUR_SIGMA = 1.5
-def load_mnist():
-    """Load MNIST train and test data as flattened (N, 784) arrays scaled to [0, 1]."""
-    transform = transforms.Compose([transforms.ToTensor()])
-    testset = datasets.MNIST(root=config.MNIST_DIR, train=False, download=True, transform=transform)
-    trainset = datasets.MNIST(root=config.MNIST_DIR, train=True, download=True, transform=transform)
-    X_train = trainset.data.numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y_train = trainset.targets.numpy()
-    X_test = testset.data.numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y_test = testset.targets.numpy()
-    return X_train, y_train, X_test, y_test
-def add_gaussian_noise(X, sigma):
-    """Add Gaussian noise to images."""
-    noise = np.random.randn(*X.shape) * sigma
-    X_noisy = X + noise
-    return np.clip(X_noisy, 0, 1)
-def set_seeds(seed: int) -> None:
-    np.random.seed(seed)
-    torch.manual_seed(seed)
-def evaluate_svd(svd, clf, X_test, y_test, mean):
-    """Evaluate SVD+LR model (with mean-centering)."""
-    X_test_centered = X_test - mean
-    X_test_svd = svd.transform(X_test_centered)
-    y_pred = clf.predict(X_test_svd)
-    return accuracy_score(y_test, y_pred)
-def evaluate_cnn(model, X_test, y_test, device):
-    """Evaluate CNN model."""
-    model.eval()
-    X_tensor = torch.tensor(X_test.reshape(-1, 1, 28, 28), dtype=torch.float32).to(device)
-    with torch.no_grad():
-        outputs = model(X_tensor)
-        preds = outputs.argmax(dim=1).cpu().numpy()
-    return accuracy_score(y_test, preds)
-def evaluate_blur_cnn(cnn, X_test, y_test, device, blur_sigma=BLUR_SIGMA):
-    """Sanity baseline: if SVD is just smoothing, blur should do equally well."""
-    X_blurred = np.array([
-        gaussian_filter(img.reshape(28, 28), sigma=blur_sigma).flatten()
-        for img in X_test
-    ])
-    X_blurred = np.clip(X_blurred, 0, 1)
-    return evaluate_cnn(cnn, X_blurred, y_test, device)
-def load_pretrained_cnn(device) -> SimpleCNN:
-    """Load the pretrained CNN from models/ for stable, reproducible evaluation."""
-    model = SimpleCNN(num_classes=10).to(device)
-    model.load_state_dict(torch.load(config.CNN_MODEL_PATH, map_location=device))
-    model.eval()
-    return model
-def train_svd_model(X_train, y_train, n_components=SVD_K):
-    """SVD+LR baseline. Mean-centered to match hybrid model's SVD layer."""
-    mean = np.mean(X_train, axis=0)
-    X_centered = X_train - mean
-    svd = TruncatedSVD(n_components=n_components, random_state=42)
-    X_train_svd = svd.fit_transform(X_centered)
-    clf = LogisticRegression(max_iter=1000)
-    clf.fit(X_train_svd, y_train)
-    return svd, clf, mean
-def plot_robustness_comparison(results, filename, std_results=None):
-    """Plot robustness curves for all models, optionally with error bands."""
-    plt.figure(figsize=(10, 6))
-    colors = {
-        'CNN': BLUE_LIGHT,
-        'SVD': BLUE_DEEP,
-        'Hybrid': RED,
-        'Blur+CNN': ORANGE,
-    }
-    markers = {'CNN': 'o', 'SVD': 's', 'Hybrid': '^', 'Blur+CNN': 'D'}
-    for model_name, accuracies in results.items():
-        plt.plot(NOISE_LEVELS, accuracies,
-                 color=colors[model_name],
-                 marker=markers[model_name],
-                 linewidth=2, markersize=8,
-                 label=f'{model_name}')
-        # Add shaded error band if std is available
-        if std_results and model_name in std_results:
-            mean = np.array(accuracies)
-            std = np.array(std_results[model_name])
-            plt.fill_between(NOISE_LEVELS, mean - std, mean + std,
-                             color=colors[model_name], alpha=0.15)
-    plt.xlabel('Noise Level (σ)', fontsize=12)
-    plt.ylabel('Accuracy', fontsize=12)
-    plt.title('Model Robustness Under Gaussian Noise', fontsize=14)
-    plt.legend(fontsize=11)
-    plt.grid(True, alpha=0.3)
-    plt.ylim(0.0, 1.05)
-    # Add annotations
-    plt.axhline(y=0.9, color='gray', linestyle='--', alpha=0.5)
-    plt.text(0.65, 0.91, '90% threshold', fontsize=10, color='gray')
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename), dpi=300)
-    plt.close()
-    print(f"Saved {filename}")
-def run_single_seed(seed, X_train, y_train, X_test, y_test, device, cnn):
-    """Run one full evaluation pass with a given random seed."""
-    set_seeds(seed)
-    print(f"\n{'='*50}")
-    print(f"  Seed = {seed}")
-    print(f"{'='*50}")
-    # Train SVD+LR (with mean-centering, consistent with hybrid)
-    svd, svd_clf, svd_mean = train_svd_model(X_train, y_train)
-    # Create Hybrid Model
-    svd_layer = create_svd_layer(X_train, n_components=SVD_K)
-    hybrid = HybridSVDCNN(svd_layer, cnn).to(device)
-    results = {'CNN': [], 'SVD': [], 'Hybrid': [], 'Blur+CNN': []}
-    for sigma in NOISE_LEVELS:
-        X_test_noisy = add_gaussian_noise(X_test, sigma)
-        acc_svd = evaluate_svd(svd, svd_clf, X_test_noisy, y_test, svd_mean)
-        acc_cnn = evaluate_cnn(cnn, X_test_noisy, y_test, device)
-        acc_hybrid = evaluate_cnn(hybrid, X_test_noisy, y_test, device)
-        acc_blur = evaluate_blur_cnn(cnn, X_test_noisy, y_test, device)
-        results['SVD'].append(acc_svd)
-        results['CNN'].append(acc_cnn)
-        results['Hybrid'].append(acc_hybrid)
-        results['Blur+CNN'].append(acc_blur)
-        print(f"  σ={sigma:.1f}  SVD={acc_svd*100:5.2f}%  CNN={acc_cnn*100:5.2f}%  "
-              f"Hybrid={acc_hybrid*100:5.2f}%  Blur+CNN={acc_blur*100:5.2f}%")
-    return results
-def main():
-    print("Loading MNIST...")
-    X_train, y_train, X_test, y_test = load_mnist()
-    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-    print(f"Using device: {device}")
-    # Load pretrained CNN (shared across all seeds)
-    print("\nLoading pretrained CNN...")
-    cnn = load_pretrained_cnn(device)
-    # Run over multiple seeds
-    all_runs = []  # list of {model_name: [acc_per_sigma]}
-    for seed in SEEDS:
-        run_results = run_single_seed(seed, X_train, y_train, X_test, y_test, device, cnn)
-        all_runs.append(run_results)
-    # Aggregate: compute mean ± std across seeds
-    model_names = ['CNN', 'SVD', 'Hybrid', 'Blur+CNN']
-    mean_results = {}
-    std_results = {}
-    for name in model_names:
-        all_accs = np.array([run[name] for run in all_runs])  # (n_seeds, n_noise_levels)
-        mean_results[name] = all_accs.mean(axis=0).tolist()
-        std_results[name] = all_accs.std(axis=0).tolist()
-    # Plot comparison (mean with error band)
-    plot_robustness_comparison(mean_results, 'fig_10_hybrid_robustness.png', std_results)
-    # Save raw numbers for reproducibility / app usage
-    out_json = {
-        "dataset": "MNIST",
-        "task": "10-class",
-        "noise_levels": NOISE_LEVELS,
-        "seeds": SEEDS,
-        "results_mean": {k: [round(x, 4) for x in v] for k, v in mean_results.items()},
-        "results_std": {k: [round(x, 4) for x in v] for k, v in std_results.items()},
-        "results": {k: [round(x, 4) for x in v] for k, v in mean_results.items()},  # backward compat
-        "svd_components": SVD_K,
-        "svd_centering": True,
-        "blur_sigma": BLUR_SIGMA,
-        "cnn_epochs": 5,
-        "notes": "Mean over 3 seeds. SVD baseline uses explicit mean-centering for consistency with hybrid layer.",
-    }
-    json_path = os.path.join(config.RESULTS_DIR, "robustness_mnist_noise.json")
-    with open(json_path, "w", encoding="utf-8") as f:
-        json.dump(out_json, f, indent=2)
-    print(f"\nSaved robustness JSON to {json_path}")
-    # Summary table
-    print("\n--- Summary (mean ± std across seeds) ---")
-    header = f"{'Model':<12} {'Clean':<16} {'σ=0.3':<16} {'σ=0.5':<16} {'σ=0.7':<16}"
-    print(header)
-    print("-" * len(header))
-    for name in model_names:
-        m = mean_results[name]
-        s = std_results[name]
-        # indices into NOISE_LEVELS (assumes 0.1 steps starting at 0.0): 0=clean, 3=0.3, 5=0.5, 7=0.7
-        print(f"{name:<12} "
-              f"{m[0]*100:5.2f}±{s[0]*100:.2f}%   "
-              f"{m[3]*100:5.2f}±{s[3]*100:.2f}%   "
-              f"{m[5]*100:5.2f}±{s[5]*100:.2f}%   "
-              f"{m[7]*100:5.2f}±{s[7]*100:.2f}%")
-    print("\nExperiment 08 Complete.")
-if __name__ == "__main__":
-    main()
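
Review note on the deleted script above: the per-seed aggregation and the summary table depend on array shapes and hard-coded list positions. A minimal standard-library sketch of the same aggregation is shown here, assuming each run is a dict mapping a model name to one accuracy per noise level. The helper names `aggregate_runs` and `accuracy_at` are hypothetical and do not appear in the repository. `pstdev` is used because `numpy.ndarray.std` defaults to the population standard deviation.

```python
from statistics import fmean, pstdev


def aggregate_runs(all_runs):
    """Return per-model mean and population std across seeds, per noise level."""
    mean_results, std_results = {}, {}
    for name in all_runs[0]:
        # zip(*...) turns (n_seeds, n_levels) into one tuple of seed values per level
        per_level = list(zip(*(run[name] for run in all_runs)))
        mean_results[name] = [fmean(vals) for vals in per_level]
        std_results[name] = [pstdev(vals) for vals in per_level]
    return mean_results, std_results


def accuracy_at(accuracies, noise_levels, sigma, tol=1e-9):
    """Look up an accuracy by noise value rather than by list position."""
    for i, level in enumerate(noise_levels):
        if abs(level - sigma) <= tol:
            return accuracies[i]
    raise ValueError(f"sigma={sigma} is not in noise_levels")


# Two illustrative seeds with made-up accuracies (not results from the repository)
example_runs = [
    {"CNN": [0.9, 0.5], "SVD": [0.8, 0.6]},
    {"CNN": [0.7, 0.3], "SVD": [0.8, 0.4]},
]
example_mean, example_std = aggregate_runs(example_runs)
```

Looking values up by sigma keeps the summary table correct if the noise grid is ever changed, at the cost of a small linear scan.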

experiments/09_fashion_hybrid_robustness.py DELETED

@@ -1,189 +0,0 @@
-# Exp 09 – Fashion-MNIST hybrid robustness under Gaussian noise
-import numpy as np
-import matplotlib.pyplot as plt
-import torch
-from torchvision import transforms
-import torchvision
-from sklearn.decomposition import TruncatedSVD
-from sklearn.linear_model import LogisticRegression
-from sklearn.metrics import accuracy_score
-import os
-import sys
-import json
-from src.hybrid_model import SimpleCNN, HybridSVDCNN, create_svd_layer
-from src import config
-# --- Configuration ---
-BLUE_LIGHT = "#88C0D0"
-BLUE_DEEP = "#5E81AC"
-RED = "#BF616A"
-SVD_K = 20
-NOISE_LEVELS = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
-def load_fashion_mnist():
-    """Load Fashion-MNIST train and test data as flattened float arrays in [0, 1]."""
-    print("Loading Fashion-MNIST...")
-    transform = transforms.Compose([transforms.ToTensor()])
-    # Use ./data/fashion to match other scripts
-    trainset = torchvision.datasets.FashionMNIST(root=config.FASHION_MNIST_DIR, train=True, download=True, transform=transform)
-    testset = torchvision.datasets.FashionMNIST(root=config.FASHION_MNIST_DIR, train=False, download=True, transform=transform)
-    X_train = trainset.data.numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y_train = trainset.targets.numpy()
-    X_test = testset.data.numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y_test = testset.targets.numpy()
-    return X_train, y_train, X_test, y_test
-def add_gaussian_noise(X, sigma):
-    """Add Gaussian noise to images."""
-    noise = np.random.randn(*X.shape) * sigma
-    X_noisy = X + noise
-    return np.clip(X_noisy, 0, 1)
-def set_seeds(seed: int) -> None:
-    np.random.seed(seed)
-    torch.manual_seed(seed)
-def load_pretrained_cnn(device) -> SimpleCNN:
-    """Load the pretrained Fashion-MNIST CNN from models/ for stable evaluation."""
-    model = SimpleCNN(num_classes=10).to(device)
-    model.load_state_dict(torch.load(config.CNN_FASHION_MODEL_PATH, map_location=device))
-    model.eval()
-    return model
-def evaluate_svd(svd, clf, X_test, y_test, mean=None):
-    """Evaluate SVD+LR model (with optional mean-centering)."""
-    X = X_test - mean if mean is not None else X_test
-    X_test_svd = svd.transform(X)
-    y_pred = clf.predict(X_test_svd)
-    return accuracy_score(y_test, y_pred)
-def evaluate_cnn(model, X_test, y_test, device):
-    """Evaluate CNN model."""
-    model.eval()
-    X_tensor = torch.tensor(X_test.reshape(-1, 1, 28, 28), dtype=torch.float32).to(device)
-    with torch.no_grad():
-        outputs = model(X_tensor)
-        preds = outputs.argmax(dim=1).cpu().numpy()
-    return accuracy_score(y_test, preds)
-def train_svd_model(X_train, y_train, n_components=SVD_K):
-    """Train SVD + Logistic Regression with explicit mean-centering."""
-    print(f"Training SVD (k={n_components}) on shape {X_train.shape}...")
-    sys.stdout.flush()
-    # Mean-center for consistency with hybrid model's SVD layer
-    mean = np.mean(X_train, axis=0)
-    X_centered = X_train - mean
-    svd = TruncatedSVD(n_components=n_components, algorithm='randomized', n_iter=5, random_state=42)
-    X_train_svd = svd.fit_transform(X_centered)
-    print("SVD fitted. Training Logistic Regression...")
-    # Increase iterations to avoid premature convergence warnings on 60k samples
-    clf = LogisticRegression(max_iter=1000)
-    clf.fit(X_train_svd, y_train)
-    print(f"SVD Explained Variance: {svd.explained_variance_ratio_.sum()*100:.2f}%")
-    return svd, clf, mean
-def plot_robustness_comparison(results, filename):
-    """Plot robustness curves for all models."""
-    plt.figure(figsize=(10, 6))
-    colors = {'CNN': BLUE_LIGHT, 'SVD': BLUE_DEEP, 'Hybrid': RED}
-    markers = {'CNN': 'o', 'SVD': 's', 'Hybrid': '^'}
-    for model_name, accuracies in results.items():
-        plt.plot(NOISE_LEVELS, accuracies,
-                 color=colors[model_name],
-                 marker=markers[model_name],
-                 linewidth=2, markersize=8,
-                 label=f'{model_name}')
-    plt.xlabel('Noise Level (σ)', fontsize=12)
-    plt.ylabel('Accuracy', fontsize=12)
-    plt.title('Fashion-MNIST: Model Robustness Under Gaussian Noise', fontsize=14)
-    plt.legend(fontsize=11)
-    plt.grid(True, alpha=0.3)
-    plt.ylim(0.0, 1.05)
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename), dpi=300)
-    plt.close()
-    print(f"Saved {filename}")
-def main():
-    set_seeds(42)
-    # 1. Load Data
-    X_train, y_train, X_test, y_test = load_fashion_mnist()
-    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-    # 2. Train Models
-    # SVD
-    svd, svd_clf, svd_mean = train_svd_model(X_train, y_train)
-    # CNN (pretrained)
-    cnn = load_pretrained_cnn(device)
-    # Hybrid
-    svd_layer = create_svd_layer(X_train, n_components=SVD_K)
-    hybrid = HybridSVDCNN(svd_layer, cnn).to(device)
-    # 3. Evaluate
-    print("\n--- Evaluating Robustness on Fashion-MNIST ---")
-    results = {'CNN': [], 'SVD': [], 'Hybrid': []}
-    for sigma in NOISE_LEVELS:
-        print(f"\nNoise σ = {sigma}")
-        X_test_noisy = add_gaussian_noise(X_test, sigma)
-        # SVD
-        acc_svd = evaluate_svd(svd, svd_clf, X_test_noisy, y_test, svd_mean)
-        results['SVD'].append(acc_svd)
-        print(f"  SVD:    {acc_svd*100:.2f}%")
-        # CNN
-        acc_cnn = evaluate_cnn(cnn, X_test_noisy, y_test, device)
-        results['CNN'].append(acc_cnn)
-        print(f"  CNN:    {acc_cnn*100:.2f}%")
-        # Hybrid
-        acc_hybrid = evaluate_cnn(hybrid, X_test_noisy, y_test, device)
-        results['Hybrid'].append(acc_hybrid)
-        print(f"  Hybrid: {acc_hybrid*100:.2f}%")
-    # 4. Save raw numbers for reproducibility / app usage
-    out_json = {
-        "dataset": "Fashion-MNIST",
-        "task": "10-class",
-        "noise_levels": NOISE_LEVELS,
-        "results": {k: [float(x) for x in v] for k, v in results.items()},
-        "svd_components": 20,
-        "cnn_epochs": 5,
-        "notes": "Numbers are evaluated on Fashion-MNIST test set with test-time Gaussian noise.",
-    }
-    json_path = os.path.join(config.RESULTS_DIR, "robustness_fashion_noise.json")
-    with open(json_path, "w", encoding="utf-8") as f:
-        json.dump(out_json, f, indent=2)
-    print(f"Saved robustness JSON to {json_path}")
-    # 5. Summary Table
-    print("\n--- Summary (Fashion-MNIST) ---")
-    print(f"{'Model':<10} {'Clean':<10} {'σ=0.3':<10} {'σ=0.5':<10} {'σ=0.7':<10}")
-    print("-" * 50)
-    for model_name in ['CNN', 'SVD', 'Hybrid']:
-        accs = results[model_name]
-        print(f"{model_name:<10} {accs[0]*100:>6.2f}%   {accs[3]*100:>6.2f}%   {accs[5]*100:>6.2f}%   {accs[7]*100:>6.2f}%")
-if __name__ == "__main__":
-    main()
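
Review note on the deleted script above: `add_gaussian_noise` draws from NumPy's global random state, so the noise it produces depends on every earlier call that consumed random numbers. A hedged sketch of an alternative is shown here, assuming a local generator per call is acceptable. The name `add_gaussian_noise_seeded` and its `seed` parameter are hypothetical and not part of the repository.

```python
import numpy as np


def add_gaussian_noise_seeded(X, sigma, seed):
    """Add clipped Gaussian noise using a local generator (hypothetical variant)."""
    rng = np.random.default_rng(seed)             # independent of np.random.seed()
    noise = rng.standard_normal(X.shape) * sigma  # zero-mean, std = sigma
    # Clipping keeps pixels in [0, 1]; near 0 and 1 the effective noise is truncated
    return np.clip(X + noise, 0.0, 1.0).astype(X.dtype)


# Illustrative input: one flattened 28x28 "image" with a linear intensity ramp
X_demo = np.linspace(0.0, 1.0, 784, dtype=np.float32).reshape(1, 784)
noisy_a = add_gaussian_noise_seeded(X_demo, sigma=0.3, seed=0)
noisy_b = add_gaussian_noise_seeded(X_demo, sigma=0.3, seed=0)
clean = add_gaussian_noise_seeded(X_demo, sigma=0.0, seed=0)
```

With this variant, the same `(sigma, seed)` pair always yields the same noisy test set, regardless of what else ran before it.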

experiments/10_ablation_study.py DELETED

@@ -1,344 +0,0 @@
-# Exp 10 – Ablation Study: Depth vs Non-linearity Contribution
-# Systematically test the independent contributions of depth and non-linearity
-import numpy as np
-import matplotlib.pyplot as plt
-import torch
-import torch.nn as nn
-import torch.optim as optim
-from torch.utils.data import TensorDataset, DataLoader
-from torchvision import datasets, transforms
-from sklearn.linear_model import LogisticRegression
-from sklearn.metrics import accuracy_score
-from sklearn.model_selection import train_test_split
-import os
-from src import config
-# --- Configuration ---
-BLUE_DEEP = "#5E81AC"
-ORANGE = "#D08770"
-GREEN = "#A3BE8C"
-RED = "#BF616A"
-PURPLE = "#B48EAD"
-SEED = 42
-BATCH_SIZE = 64
-EPOCHS = 10
-LR = 0.001
-def set_seeds(seed):
-    """Set all random seeds for reproducibility."""
-    np.random.seed(seed)
-    torch.manual_seed(seed)
-    torch.cuda.manual_seed_all(seed)
-class ShallowLinear(nn.Module):
-    """Single layer linear model (no activation)."""
-    def __init__(self, num_classes=10):
-        super().__init__()
-        self.fc = nn.Linear(784, num_classes)
-    def forward(self, x):
-        x = x.view(-1, 784)
-        return self.fc(x)
-class ShallowNonLinear(nn.Module):
-    """One hidden layer with ReLU activation."""
-    def __init__(self, num_classes=10, hidden_size=128):
-        super().__init__()
-        self.fc1 = nn.Linear(784, hidden_size)
-        self.fc2 = nn.Linear(hidden_size, num_classes)
-    def forward(self, x):
-        x = x.view(-1, 784)
-        x = torch.relu(self.fc1(x))
-        return self.fc2(x)
-class DeepLinear(nn.Module):
-    """Two hidden layers without activation; the stack is equivalent to a single linear map."""
-    def __init__(self, num_classes=10, hidden_size=128):
-        super().__init__()
-        self.fc1 = nn.Linear(784, hidden_size)
-        self.fc2 = nn.Linear(hidden_size, hidden_size)
-        self.fc3 = nn.Linear(hidden_size, num_classes)
-    def forward(self, x):
-        x = x.view(-1, 784)
-        x = self.fc1(x)  # No activation
-        x = self.fc2(x)  # No activation
-        return self.fc3(x)
-class DeepNonLinear(nn.Module):
-    """Two hidden layers with ReLU activation (similar to CNN complexity)."""
-    def __init__(self, num_classes=10, hidden_size=128):
-        super().__init__()
-        self.fc1 = nn.Linear(784, hidden_size)
-        self.fc2 = nn.Linear(hidden_size, hidden_size)
-        self.fc3 = nn.Linear(hidden_size, num_classes)
-    def forward(self, x):
-        x = x.view(-1, 784)
-        x = torch.relu(self.fc1(x))
-        x = torch.relu(self.fc2(x))
-        return self.fc3(x)
-class SimpleCNN(nn.Module):
-    """2-conv CNN for comparison (from hybrid_model)."""
-    def __init__(self, num_classes=10):
-        super().__init__()
-        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)
-        self.pool = nn.MaxPool2d(2, 2)
-        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
-        self.fc1 = nn.Linear(32 * 7 * 7, 128)
-        self.fc2 = nn.Linear(128, num_classes)
-    def forward(self, x):
-        x = self.pool(torch.relu(self.conv1(x)))
-        x = self.pool(torch.relu(self.conv2(x)))
-        x = x.view(-1, 32 * 7 * 7)
-        x = torch.relu(self.fc1(x))
-        x = self.fc2(x)
-        return x
-def load_mnist():
-    """Load MNIST train/test data."""
-    transform = transforms.Compose([transforms.ToTensor()])
-    trainset = datasets.MNIST(root=config.MNIST_DIR, train=True, download=True, transform=transform)
-    testset = datasets.MNIST(root=config.MNIST_DIR, train=False, download=True, transform=transform)
-    X_train = trainset.data.numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y_train = trainset.targets.numpy()
-    X_test = testset.data.numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y_test = testset.targets.numpy()
-    return X_train, y_train, X_test, y_test
-def train_model(model, X_train, y_train, X_val, y_val, device, epochs=EPOCHS):
-    """Train a PyTorch model with validation tracking."""
-    model = model.to(device)
-    criterion = nn.CrossEntropyLoss()
-    optimizer = optim.Adam(model.parameters(), lr=LR)
-    # Convert to tensors
-    X_train_t = torch.tensor(X_train, dtype=torch.float32)
-    y_train_t = torch.tensor(y_train, dtype=torch.long)
-    X_val_t = torch.tensor(X_val, dtype=torch.float32)
-    y_val_t = torch.tensor(y_val, dtype=torch.long)
-    # Reshape for CNN if needed
-    if isinstance(model, SimpleCNN):
-        X_train_t = X_train_t.view(-1, 1, 28, 28)
-        X_val_t = X_val_t.view(-1, 1, 28, 28)
-    train_dataset = TensorDataset(X_train_t, y_train_t)
-    train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
-    history = {'train_acc': [], 'val_acc': []}
-    for epoch in range(epochs):
-        model.train()
-        train_correct = 0
-        train_total = 0
-        for inputs, labels in train_loader:
-            inputs, labels = inputs.to(device), labels.to(device)
-            optimizer.zero_grad()
-            outputs = model(inputs)
-            loss = criterion(outputs, labels)
-            loss.backward()
-            optimizer.step()
-            _, predicted = outputs.max(1)
-            train_total += labels.size(0)
-            train_correct += predicted.eq(labels).sum().item()
-        train_acc = 100.0 * train_correct / train_total
-        # Validation
-        model.eval()
-        with torch.no_grad():
-            X_val_batch = X_val_t.to(device)
-            y_val_batch = y_val_t.to(device)
-            outputs = model(X_val_batch)
-            _, predicted = outputs.max(1)
-            val_acc = 100.0 * predicted.eq(y_val_batch).sum().item() / len(y_val_batch)
-        history['train_acc'].append(train_acc)
-        history['val_acc'].append(val_acc)
-        if (epoch + 1) % 2 == 0:
-            print(f"  Epoch {epoch+1}/{epochs}: Train Acc: {train_acc:.2f}%, Val Acc: {val_acc:.2f}%")
-    return model, history
-def evaluate_model(model, X_test, y_test, device):
-    """Evaluate model on test set."""
-    model.eval()
-    X_test_t = torch.tensor(X_test, dtype=torch.float32)
-    if isinstance(model, SimpleCNN):
-        X_test_t = X_test_t.view(-1, 1, 28, 28)
-    with torch.no_grad():
-        X_test_batch = X_test_t.to(device)
-        outputs = model(X_test_batch)
-        _, predicted = outputs.max(1)
-        accuracy = 100.0 * predicted.eq(torch.tensor(y_test).to(device)).sum().item() / len(y_test)
-    return accuracy
-def plot_ablation_results(results, filename):
-    """Plot ablation study results."""
-    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
-    # Bar chart: Test accuracy
-    models = list(results.keys())
-    test_accs = [results[m]['test_acc'] for m in models]
-    colors = [BLUE_DEEP, GREEN, PURPLE, ORANGE, RED]
-    ax1.bar(range(len(models)), test_accs, color=colors, alpha=0.8)
-    ax1.set_xticks(range(len(models)))
-    ax1.set_xticklabels(models, rotation=30, ha='right')
-    ax1.set_ylabel('Test Accuracy (%)')
-    ax1.set_title('Ablation Study: Architecture Comparison')
-    ax1.grid(axis='y', alpha=0.3)
-    ax1.set_ylim([85, 100])
-    # Add value labels
-    for i, acc in enumerate(test_accs):
-        ax1.text(i, acc + 0.5, f'{acc:.2f}%', ha='center', va='bottom', fontsize=9)
-    # Learning curves
-    for i, (model_name, data) in enumerate(results.items()):
-        if 'val_acc' in data:
-            epochs = range(1, len(data['val_acc']) + 1)
-            ax2.plot(epochs, data['val_acc'], label=model_name, color=colors[i], linewidth=2)
-    ax2.set_xlabel('Epoch')
-    ax2.set_ylabel('Validation Accuracy (%)')
-    ax2.set_title('Training Dynamics')
-    ax2.legend()
-    ax2.grid(alpha=0.3)
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename), dpi=300, bbox_inches='tight')
-    plt.close()
-    print(f"Saved {filename}")
-def main():
-    set_seeds(SEED)
-    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
-    print(f"Using device: {device}\n")
-    # Load data
-    print("Loading MNIST...")
-    X_train_full, y_train_full, X_test, y_test = load_mnist()
-    # Split train into train/val
-    X_train, X_val, y_train, y_val = train_test_split(
-        X_train_full, y_train_full, test_size=0.2, random_state=SEED, stratify=y_train_full
-    )
-    print(f"Train: {len(X_train)}, Val: {len(X_val)}, Test: {len(X_test)}\n")
-    results = {}
-    # 1. Shallow Linear
-    print("="*60)
-    print("Training: Shallow Linear (baseline)")
-    print("="*60)
-    model = ShallowLinear(num_classes=10)
-    model, history = train_model(model, X_train, y_train, X_val, y_val, device)
-    test_acc = evaluate_model(model, X_test, y_test, device)
-    results['Shallow Linear'] = {'test_acc': test_acc, 'val_acc': history['val_acc']}
-    print(f"Test Accuracy: {test_acc:.2f}%\n")
-    # 2. Shallow Non-Linear
-    print("="*60)
-    print("Training: Shallow Non-Linear (+ ReLU)")
-    print("="*60)
-    model = ShallowNonLinear(num_classes=10)
-    model, history = train_model(model, X_train, y_train, X_val, y_val, device)
-    test_acc = evaluate_model(model, X_test, y_test, device)
-    results['Shallow NonLinear'] = {'test_acc': test_acc, 'val_acc': history['val_acc']}
-    print(f"Test Accuracy: {test_acc:.2f}%\n")
-    # 3. Deep Linear
-    print("="*60)
-    print("Training: Deep Linear (+ Depth, no ReLU)")
-    print("="*60)
-    model = DeepLinear(num_classes=10)
-    model, history = train_model(model, X_train, y_train, X_val, y_val, device)
-    test_acc = evaluate_model(model, X_test, y_test, device)
-    results['Deep Linear'] = {'test_acc': test_acc, 'val_acc': history['val_acc']}
-    print(f"Test Accuracy: {test_acc:.2f}%\n")
-    # 4. Deep Non-Linear
-    print("="*60)
-    print("Training: Deep Non-Linear (+ Depth + ReLU)")
-    print("="*60)
-    model = DeepNonLinear(num_classes=10)
-    model, history = train_model(model, X_train, y_train, X_val, y_val, device)
-    test_acc = evaluate_model(model, X_test, y_test, device)
-    results['Deep NonLinear'] = {'test_acc': test_acc, 'val_acc': history['val_acc']}
-    print(f"Test Accuracy: {test_acc:.2f}%\n")
-    # 5. CNN (for reference)
-    print("="*60)
-    print("Training: CNN (convolutional + non-linear)")
-    print("="*60)
-    model = SimpleCNN(num_classes=10)
-    # Reshape data for CNN
-    X_train_cnn = X_train.reshape(-1, 28, 28)
-    X_val_cnn = X_val.reshape(-1, 28, 28)
-    X_test_cnn = X_test.reshape(-1, 28, 28)
-    model, history = train_model(model, X_train_cnn, y_train, X_val_cnn, y_val, device)
-    test_acc = evaluate_model(model, X_test_cnn, y_test, device)
-    results['CNN'] = {'test_acc': test_acc, 'val_acc': history['val_acc']}
-    print(f"Test Accuracy: {test_acc:.2f}%\n")
-    # Summary
-    print("="*60)
-    print("ABLATION STUDY SUMMARY")
-    print("="*60)
-    for model_name, data in results.items():
-        print(f"{model_name:20s}: {data['test_acc']:.2f}%")
-    # Analysis
-    print("\n" + "="*60)
-    print("KEY INSIGHTS")
-    print("="*60)
-    shallow_linear = results['Shallow Linear']['test_acc']
-    shallow_nonlinear = results['Shallow NonLinear']['test_acc']
-    deep_linear = results['Deep Linear']['test_acc']
-    deep_nonlinear = results['Deep NonLinear']['test_acc']
-    nonlinearity_gain = shallow_nonlinear - shallow_linear
-    depth_gain = deep_linear - shallow_linear
-    combined_gain = deep_nonlinear - shallow_linear
-    print(f"Non-linearity alone (shallow):   +{nonlinearity_gain:.2f} pp")
-    print(f"Depth alone (linear):             +{depth_gain:.2f} pp")
-    print(f"Depth + Non-linearity (combined): +{combined_gain:.2f} pp")
-    print(f"CNN (convolutional structure):    +{results['CNN']['test_acc'] - shallow_linear:.2f} pp")
-    # Plot results
-    plot_ablation_results(results, 'fig_13_ablation_study.png')
-    print("\n✓ Ablation study complete!")
-if __name__ == "__main__":
-    main()
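
Review note on the deleted script above: `DeepLinear` stacks three `nn.Linear` layers with no activation between them, and a composition of affine maps is itself one affine map. The "Depth alone (linear)" gain therefore reflects optimization and parameterization effects rather than extra expressive power. The sketch below illustrates this with NumPy on small made-up dimensions. The helper `collapse_affine` is hypothetical, and weights use the `(out_features, in_features)` layout that `nn.Linear` uses.

```python
import numpy as np


def collapse_affine(layers):
    """Fold affine layers (W, b), applied in order as x -> W @ x + b, into one (W, b)."""
    in_dim = layers[0][0].shape[1]
    W_total = np.eye(in_dim)
    b_total = np.zeros(in_dim)
    for W, b in layers:
        W_total = W @ W_total      # compose the linear parts
        b_total = W @ b_total + b  # push the accumulated bias through this layer
    return W_total, b_total


rng = np.random.default_rng(0)
dims = [6, 4, 4, 3]  # toy stand-in for 784 -> 128 -> 128 -> 10
layers = [
    (rng.standard_normal((d_out, d_in)), rng.standard_normal(d_out))
    for d_in, d_out in zip(dims[:-1], dims[1:])
]
x = rng.standard_normal(dims[0])

# Apply the layers one by one, as DeepLinear.forward does
stacked_out = x
for W, b in layers:
    stacked_out = W @ stacked_out + b

# Apply the single collapsed layer
W_single, b_single = collapse_affine(layers)
collapsed_out = W_single @ x + b_single
```

The two outputs agree up to floating-point error, which is why a ReLU between layers is needed before depth can change what the model is able to represent.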

experiments/11_learning_curves.py DELETED

@@ -1,228 +0,0 @@
-# Exp 11 – Learning Curves Visualization
-# Generate training/validation loss and accuracy curves from saved training history
-import matplotlib.pyplot as plt
-import pickle
-import os
-import numpy as np
-from src import config
-# --- Configuration ---
-BLUE_DEEP = "#5E81AC"
-ORANGE = "#D08770"
-GRAY_LIGHT = "#D8DEE9"
-def plot_learning_curves(history, title, filename):
-    """
-    Plot training and validation curves for loss and accuracy.
-    Args:
-        history: Dictionary with keys 'train_loss', 'val_loss', 'train_acc', 'val_acc'
-        title: Plot title
-        filename: Output filename
-    """
-    epochs = range(1, len(history['train_loss']) + 1)
-    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
-    # Loss curves
-    ax1.plot(epochs, history['train_loss'], label='Training Loss',
-             color=BLUE_DEEP, linewidth=2, marker='o', markersize=4)
-    ax1.plot(epochs, history['val_loss'], label='Validation Loss',
-             color=ORANGE, linewidth=2, marker='s', markersize=4)
-    ax1.set_xlabel('Epoch', fontsize=12)
-    ax1.set_ylabel('Loss', fontsize=12)
-    ax1.set_title('Training and Validation Loss', fontsize=13, fontweight='bold')
-    ax1.legend(loc='best', fontsize=10)
-    ax1.grid(alpha=0.3)
-    # Highlight best validation loss
-    best_val_epoch = np.argmin(history['val_loss'])
-    best_val_loss = history['val_loss'][best_val_epoch]
-    ax1.axvline(x=best_val_epoch + 1, color='red', linestyle='--', alpha=0.5, linewidth=1)
-    ax1.plot(best_val_epoch + 1, best_val_loss, 'r*', markersize=15,
-             label=f'Best Val Loss: {best_val_loss:.4f} @ Epoch {best_val_epoch + 1}')
-    ax1.legend(loc='best', fontsize=9)
-    # Accuracy curves
-    ax2.plot(epochs, history['train_acc'], label='Training Accuracy',
-             color=BLUE_DEEP, linewidth=2, marker='o', markersize=4)
-    ax2.plot(epochs, history['val_acc'], label='Validation Accuracy',
-             color=ORANGE, linewidth=2, marker='s', markersize=4)
-    ax2.set_xlabel('Epoch', fontsize=12)
-    ax2.set_ylabel('Accuracy (%)', fontsize=12)
-    ax2.set_title('Training and Validation Accuracy', fontsize=13, fontweight='bold')
-    ax2.legend(loc='best', fontsize=10)
-    ax2.grid(alpha=0.3)
-    # Highlight best validation accuracy
-    best_val_epoch = np.argmax(history['val_acc'])
-    best_val_acc = history['val_acc'][best_val_epoch]
-    ax2.axvline(x=best_val_epoch + 1, color='red', linestyle='--', alpha=0.5, linewidth=1)
-    ax2.plot(best_val_epoch + 1, best_val_acc, 'r*', markersize=15,
-             label=f'Best Val Acc: {best_val_acc:.2f}% @ Epoch {best_val_epoch + 1}')
-    ax2.legend(loc='best', fontsize=9)
-    plt.suptitle(title, fontsize=15, fontweight='bold', y=1.02)
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename), dpi=300, bbox_inches='tight')
-    plt.close()
-    print(f"✓ Saved {filename}")
-def analyze_overfitting(history):
-    """Analyze training dynamics for overfitting indicators."""
-    train_acc = history['train_acc']
-    val_acc = history['val_acc']
-    train_loss = history['train_loss']
-    val_loss = history['val_loss']
-    # Calculate gaps
-    final_acc_gap = train_acc[-1] - val_acc[-1]
-    final_loss_gap = val_loss[-1] - train_loss[-1]
-    # Check for divergence (sign of overfitting)
-    mid_point = len(train_acc) // 2
-    early_acc_gap = np.mean(train_acc[:mid_point]) - np.mean(val_acc[:mid_point])
-    late_acc_gap = np.mean(train_acc[mid_point:]) - np.mean(val_acc[mid_point:])
-    gap_increase = late_acc_gap - early_acc_gap
-    print("\n" + "="*60)
-    print("OVERFITTING ANALYSIS")
-    print("="*60)
-    print(f"Final Train Accuracy:      {train_acc[-1]:.2f}%")
-    print(f"Final Validation Accuracy: {val_acc[-1]:.2f}%")
-    print(f"Accuracy Gap:              {final_acc_gap:.2f} pp")
-    print(f"Loss Gap:                  {final_loss_gap:.4f}")
-    print(f"Gap Increase (early→late): {gap_increase:.2f} pp")
-    if final_acc_gap > 5.0:
-        print("\n⚠️  WARNING: Significant train-val accuracy gap detected (>5 pp)")
-        print("   Consider: regularization, dropout, or early stopping")
-    elif gap_increase > 2.0:
-        print("\n⚠️  WARNING: Train-val gap widening over time")
-        print("   Model may be starting to overfit")
-    else:
-        print("\n✓ No significant overfitting detected")
-    # Best epoch analysis
-    best_epoch = np.argmax(val_acc)
-    total_epochs = len(val_acc)
-    print(f"\nBest validation accuracy achieved at epoch {best_epoch + 1}/{total_epochs}")
-    if best_epoch < total_epochs - 2:
-        print(f"⚠️  Training continued for {total_epochs - best_epoch - 1} epochs after best model")
-        print("   Early stopping could have saved training time")
-    return {
-        'final_acc_gap': final_acc_gap,
-        'final_loss_gap': final_loss_gap,
-        'gap_increase': gap_increase,
-        'best_epoch': best_epoch + 1,
-        'total_epochs': total_epochs
-    }
-def plot_comparative_curves(histories, labels, filename):
-    """
-    Plot multiple models' learning curves for comparison.
-    Args:
-        histories: List of history dictionaries
-        labels: List of model names
-        filename: Output filename
-    """
-    colors = [BLUE_DEEP, ORANGE, "#A3BE8C", "#BF616A", "#B48EAD"]
-    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
-    for i, (history, label) in enumerate(zip(histories, labels)):
-        epochs = range(1, len(history['val_loss']) + 1)
-        color = colors[i % len(colors)]
-        # Validation loss
-        ax1.plot(epochs, history['val_loss'], label=label,
-                color=color, linewidth=2, marker='o', markersize=3)
-        # Validation accuracy
-        ax2.plot(epochs, history['val_acc'], label=label,
-                color=color, linewidth=2, marker='o', markersize=3)
-    ax1.set_xlabel('Epoch', fontsize=12)
-    ax1.set_ylabel('Validation Loss', fontsize=12)
-    ax1.set_title('Validation Loss Comparison', fontsize=13, fontweight='bold')
-    ax1.legend(loc='best', fontsize=10)
-    ax1.grid(alpha=0.3)
-    ax2.set_xlabel('Epoch', fontsize=12)
-    ax2.set_ylabel('Validation Accuracy (%)', fontsize=12)
-    ax2.set_title('Validation Accuracy Comparison', fontsize=13, fontweight='bold')
-    ax2.legend(loc='best', fontsize=10)
-    ax2.grid(alpha=0.3)
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename), dpi=300, bbox_inches='tight')
-    plt.close()
-    print(f"✓ Saved {filename}")
-def main():
-    print("="*60)
-    print("LEARNING CURVES VISUALIZATION")
-    print("="*60)
-    # Load CNN training history
-    history_path = os.path.join(config.MODELS_DIR, 'cnn_10class_history.pkl')
-    if os.path.exists(history_path):
-        print(f"\nLoading training history from {history_path}")
-        with open(history_path, 'rb') as f:
-            history = pickle.load(f)
-        print(f"Loaded {len(history['train_loss'])} epochs of training data")
-        # Plot learning curves
-        plot_learning_curves(
-            history,
-            'CNN Training Dynamics (MNIST 10-class)',
-            'fig_14_learning_curves.png'
-        )
-        # Analyze for overfitting
-        analysis = analyze_overfitting(history)
-    else:
-        print(f"\n⚠️  Training history not found at {history_path}")
-        print("Please run 'python src/train_models.py' first to generate training history")
-        return
-    # Check for Fashion-MNIST history
-    fashion_history_path = os.path.join(config.MODELS_DIR, 'cnn_fashion_history.pkl')
-    if os.path.exists(fashion_history_path):
-        print(f"\nLoading Fashion-MNIST training history...")
-        with open(fashion_history_path, 'rb') as f:
-            fashion_history = pickle.load(f)
-        plot_learning_curves(
-            fashion_history,
-            'CNN Training Dynamics (Fashion-MNIST)',
-            'fig_15_learning_curves_fashion.png'
-        )
-        # Comparative plot
-        print("\nGenerating comparative learning curves...")
-        plot_comparative_curves(
-            [history, fashion_history],
-            ['MNIST', 'Fashion-MNIST'],
-            'fig_16_learning_curves_comparison.png'
-        )
-    print("\n" + "="*60)
-    print("✓ Learning curves visualization complete!")
-    print("="*60)
-    print(f"Results saved to: {config.RESULTS_DIR}")
-if __name__ == "__main__":
-    main()

experiments/12_roc_analysis.py DELETED

@@ -1,291 +0,0 @@
-# Exp 12 – ROC Curve Analysis for Hard Classification Pairs
-# Particularly focused on the challenging 3 vs 8 classification
-import numpy as np
-import matplotlib.pyplot as plt
-import torch
-from torchvision import datasets, transforms
-from sklearn.decomposition import TruncatedSVD
-from sklearn.linear_model import LogisticRegression
-from sklearn.metrics import roc_curve, auc, roc_auc_score
-import pickle
-import os
-from src.hybrid_model import SimpleCNN
-from src import config
-# --- Configuration ---
-BLUE_DEEP = "#5E81AC"
-ORANGE = "#D08770"
-GREEN = "#A3BE8C"
-RED = "#BF616A"
-SEED = 42
-def set_seeds(seed):
-    """Set random seeds for reproducibility."""
-    np.random.seed(seed)
-    torch.manual_seed(seed)
-def load_mnist_subset(digit_a=3, digit_b=8):
-    """Load MNIST subset with only two specified digits."""
-    transform = transforms.Compose([transforms.ToTensor()])
-    testset = datasets.MNIST(root=config.MNIST_DIR, train=False, download=True, transform=transform)
-    # Filter for specific digits
-    mask = (testset.targets == digit_a) | (testset.targets == digit_b)
-    X = testset.data[mask].numpy().astype(np.float32) / 255.0
-    y = testset.targets[mask].numpy()
-    # Binary labels: 0 for digit_a, 1 for digit_b
-    y_binary = (y == digit_b).astype(int)
-    print(f"Loaded {len(X)} samples: {np.sum(y_binary == 0)} digit-{digit_a}, {np.sum(y_binary == 1)} digit-{digit_b}")
-    return X, y_binary, digit_a, digit_b
-def get_svd_probabilities(X, svd_path=config.SVD_MODEL_PATH):
-    """Get probability scores from SVD+LR model."""
-    with open(svd_path, 'rb') as f:
-        svd = pickle.load(f)
-    # Get mean from saved model
-    if hasattr(svd, '_train_mean'):
-        mean = svd._train_mean
-    else:
-        mean = np.zeros(784)
-    X_flat = X.reshape(-1, 784)
-    X_centered = X_flat - mean
-    X_svd = svd.transform(X_centered)
-    # Train a binary classifier on 3 vs 8
-    print("Training binary SVD+LR classifier for ROC analysis...")
-    X_train_full, y_train_full = load_mnist_binary_train()
-    X_train_centered = X_train_full - mean
-    X_train_svd = svd.transform(X_train_centered)
-    clf = LogisticRegression(random_state=SEED, max_iter=1000)
-    clf.fit(X_train_svd, y_train_full)
-    # Get probability scores (for positive class)
-    probs = clf.predict_proba(X_svd)[:, 1]
-    return probs
-def load_mnist_binary_train(digit_a=3, digit_b=8):
-    """Load training data for binary classification."""
-    transform = transforms.Compose([transforms.ToTensor()])
-    trainset = datasets.MNIST(root=config.MNIST_DIR, train=True, download=True, transform=transform)
-    mask = (trainset.targets == digit_a) | (trainset.targets == digit_b)
-    X = trainset.data[mask].numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y = trainset.targets[mask].numpy()
-    y_binary = (y == digit_b).astype(int)
-    return X, y_binary
-def get_cnn_probabilities(X, cnn_path=config.CNN_MODEL_PATH, digit_a=3, digit_b=8):
-    """Get probability scores from CNN model."""
-    device = torch.device('cpu')
-    cnn = SimpleCNN(num_classes=10)
-    cnn.load_state_dict(torch.load(cnn_path, map_location=device))
-    cnn.eval()
-    X_tensor = torch.tensor(X, dtype=torch.float32).view(-1, 1, 28, 28)
-    with torch.no_grad():
-        outputs = cnn(X_tensor)
-        probs = torch.softmax(outputs, dim=1).numpy()
-    # Extract probabilities for the two digits of interest
-    prob_a = probs[:, digit_a]
-    prob_b = probs[:, digit_b]
-    # Normalize to binary probability (probability of digit_b given only these two options)
-    binary_probs = prob_b / (prob_a + prob_b + 1e-10)
-    return binary_probs
-def plot_roc_curves(results, digit_a, digit_b, filename):
-    """
-    Plot ROC curves for different models.
-    Args:
-        results: Dictionary with model names as keys and (fpr, tpr, auc) as values
-        digit_a, digit_b: The two digits being classified
-        filename: Output filename
-    """
-    plt.figure(figsize=(10, 8))
-    colors = [BLUE_DEEP, ORANGE, GREEN, RED]
-    for i, (model_name, (fpr, tpr, roc_auc)) in enumerate(results.items()):
-        plt.plot(fpr, tpr, color=colors[i % len(colors)], linewidth=2.5,
-                label=f'{model_name} (AUC = {roc_auc:.4f})', marker='o', markersize=4, markevery=20)
-    # Random classifier baseline
-    plt.plot([0, 1], [0, 1], color='gray', linestyle='--', linewidth=2, label='Random Classifier (AUC = 0.5000)')
-    plt.xlim([0.0, 1.0])
-    plt.ylim([0.0, 1.05])
-    plt.xlabel('False Positive Rate', fontsize=13)
-    plt.ylabel('True Positive Rate', fontsize=13)
-    plt.title(f'ROC Curves: Digit {digit_a} vs {digit_b} Classification', fontsize=14, fontweight='bold')
-    plt.legend(loc='lower right', fontsize=11)
-    plt.grid(alpha=0.3)
-    # Highlight perfect classification region
-    plt.fill_between([0, 0, 0.1], [0.9, 1, 1], alpha=0.1, color='green', label='_nolegend_')
-    plt.text(0.02, 0.95, 'Ideal Region', fontsize=9, color='green', alpha=0.7)
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename), dpi=300, bbox_inches='tight')
-    plt.close()
-    print(f"✓ Saved {filename}")
-def plot_roc_zoom(results, digit_a, digit_b, filename):
-    """Plot zoomed-in ROC curves focusing on high-sensitivity region."""
-    plt.figure(figsize=(10, 8))
-    colors = [BLUE_DEEP, ORANGE, GREEN, RED]
-    for i, (model_name, (fpr, tpr, roc_auc)) in enumerate(results.items()):
-        plt.plot(fpr, tpr, color=colors[i % len(colors)], linewidth=2.5,
-                label=f'{model_name} (AUC = {roc_auc:.4f})', marker='o', markersize=5, markevery=10)
-    plt.plot([0, 1], [0, 1], color='gray', linestyle='--', linewidth=2, alpha=0.5)
-    # Zoom to interesting region
-    plt.xlim([0.0, 0.2])
-    plt.ylim([0.8, 1.0])
-    plt.xlabel('False Positive Rate', fontsize=13)
-    plt.ylabel('True Positive Rate', fontsize=13)
-    plt.title(f'ROC Curves (Zoomed): Digit {digit_a} vs {digit_b}', fontsize=14, fontweight='bold')
-    plt.legend(loc='lower right', fontsize=11)
-    plt.grid(alpha=0.3)
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename), dpi=300, bbox_inches='tight')
-    plt.close()
-    print(f"✓ Saved {filename}")
-def analyze_threshold_performance(y_true, y_probs, thresholds=[0.3, 0.5, 0.7, 0.9]):
-    """Analyze model performance at different decision thresholds."""
-    print("\n" + "="*60)
-    print("THRESHOLD SENSITIVITY ANALYSIS")
-    print("="*60)
-    print(f"{'Threshold':<12} {'Accuracy':<12} {'TPR':<12} {'FPR':<12} {'Precision':<12}")
-    print("-"*60)
-    for threshold in thresholds:
-        y_pred = (y_probs >= threshold).astype(int)
-        # Calculate metrics
-        tp = np.sum((y_pred == 1) & (y_true == 1))
-        tn = np.sum((y_pred == 0) & (y_true == 0))
-        fp = np.sum((y_pred == 1) & (y_true == 0))
-        fn = np.sum((y_pred == 0) & (y_true == 1))
-        accuracy = (tp + tn) / len(y_true)
-        tpr = tp / (tp + fn) if (tp + fn) > 0 else 0
-        fpr = fp / (fp + tn) if (fp + tn) > 0 else 0
-        precision = tp / (tp + fp) if (tp + fp) > 0 else 0
-        print(f"{threshold:<12.1f} {accuracy:<12.4f} {tpr:<12.4f} {fpr:<12.4f} {precision:<12.4f}")
-def main():
-    set_seeds(SEED)
-    print("="*60)
-    print("ROC CURVE ANALYSIS: Digit 3 vs 8")
-    print("="*60)
-    # Load test data
-    print("\nLoading test data...")
-    X_test, y_test, digit_a, digit_b = load_mnist_subset(digit_a=3, digit_b=8)
-    results = {}
-    # SVD+LR model
-    print("\n" + "-"*60)
-    print("Evaluating SVD+LR model...")
-    print("-"*60)
-    try:
-        svd_probs = get_svd_probabilities(X_test)
-        fpr_svd, tpr_svd, _ = roc_curve(y_test, svd_probs)
-        auc_svd = auc(fpr_svd, tpr_svd)
-        results['SVD+LR'] = (fpr_svd, tpr_svd, auc_svd)
-        print(f"✓ SVD+LR AUC: {auc_svd:.4f}")
-        analyze_threshold_performance(y_test, svd_probs)
-    except Exception as e:
-        print(f"⚠️  Could not evaluate SVD model: {e}")
-    # CNN model
-    print("\n" + "-"*60)
-    print("Evaluating CNN model...")
-    print("-"*60)
-    try:
-        cnn_probs = get_cnn_probabilities(X_test, digit_a=digit_a, digit_b=digit_b)
-        fpr_cnn, tpr_cnn, _ = roc_curve(y_test, cnn_probs)
-        auc_cnn = auc(fpr_cnn, tpr_cnn)
-        results['CNN'] = (fpr_cnn, tpr_cnn, auc_cnn)
-        print(f"✓ CNN AUC: {auc_cnn:.4f}")
-        analyze_threshold_performance(y_test, cnn_probs)
-    except Exception as e:
-        print(f"⚠️  Could not evaluate CNN model: {e}")
-    # Plot results
-    if len(results) > 0:
-        print("\n" + "="*60)
-        print("Generating ROC visualizations...")
-        print("="*60)
-        plot_roc_curves(results, digit_a, digit_b, 'fig_17_roc_curves.png')
-        plot_roc_zoom(results, digit_a, digit_b, 'fig_18_roc_curves_zoom.png')
-        # Summary
-        print("\n" + "="*60)
-        print("SUMMARY")
-        print("="*60)
-        for model_name, (_, _, roc_auc) in results.items():
-            print(f"{model_name:15s}: AUC = {roc_auc:.4f}")
-        # Interpretation
-        print("\n" + "="*60)
-        print("INTERPRETATION")
-        print("="*60)
-        print("• AUC = 1.0: Perfect classifier")
-        print("• AUC = 0.9-1.0: Excellent")
-        print("• AUC = 0.8-0.9: Good")
-        print("• AUC = 0.7-0.8: Fair")
-        print("• AUC = 0.5: Random classifier")
-        if 'CNN' in results and 'SVD+LR' in results:
-            auc_diff = results['CNN'][2] - results['SVD+LR'][2]
-            print(f"\nCNN advantage over SVD: {auc_diff:.4f} AUC points")
-            if auc_diff > 0.05:
-                print("→ CNN shows substantially better discrimination ability")
-            elif auc_diff > 0.02:
-                print("→ CNN shows moderately better discrimination")
-            else:
-                print("→ Models show similar discrimination ability")
-        print("\n✓ ROC analysis complete!")
-    else:
-        print("\n⚠️  No models could be evaluated. Please train models first.")
-if __name__ == "__main__":
-    main()
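
Review note: the removed script scored the 10-class CNN on a two-digit task by renormalising two of its softmax outputs, then compared models by AUC. A minimal sketch of that idea follows, using only the standard library. The names `binary_prob` and `rank_auc` and the toy softmax values are illustrative assumptions, not code from this repository. `rank_auc` uses the pairwise-ranking definition of AUC (ties count as one half), which gives the same value as the area under the ROC curve.

```python
def binary_prob(prob_a, prob_b, eps=1e-10):
    """P(digit_b | digit_a or digit_b), renormalised from two softmax outputs."""
    return prob_b / (prob_a + prob_b + eps)


def rank_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical (P(3), P(8)) softmax pairs for four test images
softmax_pairs = [(0.70, 0.10), (0.20, 0.60), (0.45, 0.45), (0.05, 0.90)]
y_true = [0, 0, 1, 1]  # 0 = digit 3, 1 = digit 8

scores = [binary_prob(a, b) for a, b in softmax_pairs]
auc_value = rank_auc(y_true, scores)
print(f"AUC = {auc_value:.2f}")
```

The quadratic pairwise loop is only meant to make the definition visible; for full test sets `sklearn.metrics.roc_auc_score`, which the removed script already imported, is the practical choice.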

experiments/13_per_class_metrics.py DELETED

@@ -1,366 +0,0 @@
-# Exp 13 – Per-Class Performance Metrics
-# Generate detailed classification report with Precision, Recall, F1-Score for each digit
-import numpy as np
-import matplotlib.pyplot as plt
-import pandas as pd
-import seaborn as sns
-import torch
-from torchvision import datasets, transforms
-from sklearn.decomposition import TruncatedSVD
-from sklearn.linear_model import LogisticRegression
-from sklearn.metrics import classification_report, precision_recall_fscore_support, confusion_matrix
-import pickle
-import os
-from src.hybrid_model import SimpleCNN
-from src import config
-# --- Configuration ---
-BLUE_DEEP = "#5E81AC"
-ORANGE = "#D08770"
-GREEN = "#A3BE8C"
-RED = "#BF616A"
-SEED = 42
-def set_seeds(seed):
-    """Set random seeds for reproducibility."""
-    np.random.seed(seed)
-    torch.manual_seed(seed)
-def load_mnist():
-    """Load MNIST test data."""
-    transform = transforms.Compose([transforms.ToTensor()])
-    testset = datasets.MNIST(root=config.MNIST_DIR, train=False, download=True, transform=transform)
-    X_test = testset.data.numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y_test = testset.targets.numpy()
-    return X_test, y_test
-def load_mnist_train():
-    """Load MNIST training data for SVD."""
-    transform = transforms.Compose([transforms.ToTensor()])
-    trainset = datasets.MNIST(root=config.MNIST_DIR, train=True, download=True, transform=transform)
-    X_train = trainset.data.numpy().reshape(-1, 784).astype(np.float32) / 255.0
-    y_train = trainset.targets.numpy()
-    return X_train, y_train
-def evaluate_svd(X_test, y_test, svd_path=config.SVD_MODEL_PATH):
-    """Evaluate SVD+LR model."""
-    print("Evaluating SVD+LR model...")
-    # Load SVD model
-    with open(svd_path, 'rb') as f:
-        svd = pickle.load(f)
-    if hasattr(svd, '_train_mean'):
-        mean = svd._train_mean
-    else:
-        mean = np.zeros(784)
-    # Transform test data
-    X_test_centered = X_test - mean
-    X_test_svd = svd.transform(X_test_centered)
-    # Train classifier
-    print("  Training LogisticRegression classifier...")
-    X_train, y_train = load_mnist_train()
-    X_train_centered = X_train - mean
-    X_train_svd = svd.transform(X_train_centered)
-    clf = LogisticRegression(random_state=SEED, max_iter=1000)
-    clf.fit(X_train_svd, y_train)
-    # Predict
-    y_pred = clf.predict(X_test_svd)
-    return y_pred
-def evaluate_cnn(X_test, y_test, cnn_path=config.CNN_MODEL_PATH):
-    """Evaluate CNN model."""
-    print("Evaluating CNN model...")
-    device = torch.device('cpu')
-    cnn = SimpleCNN(num_classes=10)
-    cnn.load_state_dict(torch.load(cnn_path, map_location=device))
-    cnn.eval()
-    X_tensor = torch.tensor(X_test, dtype=torch.float32).view(-1, 1, 28, 28)
-    with torch.no_grad():
-        outputs = cnn(X_tensor)
-        y_pred = outputs.argmax(dim=1).numpy()
-    return y_pred
-def create_metrics_table(y_true, y_pred, model_name):
-    """Create per-class metrics table."""
-    precision, recall, f1, support = precision_recall_fscore_support(y_true, y_pred, average=None)
-    # Create DataFrame
-    df = pd.DataFrame({
-        'Class': [f'Digit {i}' for i in range(10)],
-        'Precision': precision,
-        'Recall': recall,
-        'F1-Score': f1,
-        'Support': support
-    })
-    # Add overall metrics
-    precision_avg, recall_avg, f1_avg, _ = precision_recall_fscore_support(y_true, y_pred, average='weighted')
-    overall = pd.DataFrame({
-        'Class': ['Overall (weighted)'],
-        'Precision': [precision_avg],
-        'Recall': [recall_avg],
-        'F1-Score': [f1_avg],
-        'Support': [len(y_true)]
-    })
-    df = pd.concat([df, overall], ignore_index=True)
-    return df
-def plot_metrics_comparison(df_svd, df_cnn, filename):
-    """Plot side-by-side comparison of metrics."""
-    fig, axes = plt.subplots(1, 3, figsize=(18, 6))
-    metrics = ['Precision', 'Recall', 'F1-Score']
-    colors = [BLUE_DEEP, ORANGE]
-    # Exclude overall row for bar chart
-    df_svd_plot = df_svd.iloc[:-1]
-    df_cnn_plot = df_cnn.iloc[:-1]
-    x = np.arange(10)
-    width = 0.35
-    for i, metric in enumerate(metrics):
-        ax = axes[i]
-        svd_values = df_svd_plot[metric].values
-        cnn_values = df_cnn_plot[metric].values
-        ax.bar(x - width/2, svd_values, width, label='SVD+LR', color=colors[0], alpha=0.8)
-        ax.bar(x + width/2, cnn_values, width, label='CNN', color=colors[1], alpha=0.8)
-        ax.set_xlabel('Digit Class', fontsize=12)
-        ax.set_ylabel(metric, fontsize=12)
-        ax.set_title(f'{metric} by Digit', fontsize=13, fontweight='bold')
-        ax.set_xticks(x)
-        ax.set_xticklabels([str(i) for i in range(10)])
-        ax.legend()
-        ax.grid(axis='y', alpha=0.3)
-        ax.set_ylim([0.7, 1.0])
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename), dpi=300, bbox_inches='tight')
-    plt.close()
-    print(f"✓ Saved {filename}")
-def plot_metrics_heatmap(df_svd, df_cnn, filename):
-    """Create heatmap showing per-class metrics."""
-    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 5))
-    # Prepare data (exclude overall row)
-    svd_data = df_svd.iloc[:-1][['Precision', 'Recall', 'F1-Score']].T
-    cnn_data = df_cnn.iloc[:-1][['Precision', 'Recall', 'F1-Score']].T
-    svd_data.columns = [str(i) for i in range(10)]
-    cnn_data.columns = [str(i) for i in range(10)]
-    # SVD heatmap
-    sns.heatmap(svd_data, annot=True, fmt='.3f', cmap='Blues',
-                vmin=0.7, vmax=1.0, ax=ax1, cbar_kws={'label': 'Score'})
-    ax1.set_title('SVD+LR Per-Class Metrics', fontsize=13, fontweight='bold')
-    ax1.set_xlabel('Digit Class', fontsize=12)
-    ax1.set_ylabel('Metric', fontsize=12)
-    # CNN heatmap
-    sns.heatmap(cnn_data, annot=True, fmt='.3f', cmap='Oranges',
-                vmin=0.7, vmax=1.0, ax=ax2, cbar_kws={'label': 'Score'})
-    ax2.set_title('CNN Per-Class Metrics', fontsize=13, fontweight='bold')
-    ax2.set_xlabel('Digit Class', fontsize=12)
-    ax2.set_ylabel('Metric', fontsize=12)
-    plt.tight_layout()
-    plt.savefig(os.path.join(config.RESULTS_DIR, filename), dpi=300, bbox_inches='tight')
-    plt.close()
-    print(f"✓ Saved {filename}")
-def identify_hard_pairs(y_true, y_pred_svd, y_pred_cnn):
-    """Identify digit pairs that are frequently confused."""
-    print("\n" + "="*60)
-    print("HARD CLASSIFICATION PAIRS")
-    print("="*60)
-    cm_svd = confusion_matrix(y_true, y_pred_svd, normalize='true')
-    cm_cnn = confusion_matrix(y_true, y_pred_cnn, normalize='true')
-    # Find top confusions (excluding diagonal)
-    np.fill_diagonal(cm_svd, 0)
-    np.fill_diagonal(cm_cnn, 0)
-    print("\nTop 5 SVD+LR Confusions:")
-    svd_confusions = []
-    for i in range(10):
-        for j in range(10):
-            if i != j and cm_svd[i, j] > 0.01:
-                svd_confusions.append((i, j, cm_svd[i, j]))
-    svd_confusions.sort(key=lambda x: x[2], reverse=True)
-    for i, (true_class, pred_class, rate) in enumerate(svd_confusions[:5], 1):
-        print(f"  {i}. {true_class} → {pred_class}: {rate*100:.2f}%")
-    print("\nTop 5 CNN Confusions:")
-    cnn_confusions = []
-    for i in range(10):
-        for j in range(10):
-            if i != j and cm_cnn[i, j] > 0.01:
-                cnn_confusions.append((i, j, cm_cnn[i, j]))
-    cnn_confusions.sort(key=lambda x: x[2], reverse=True)
-    for i, (true_class, pred_class, rate) in enumerate(cnn_confusions[:5], 1):
-        print(f"  {i}. {true_class} → {pred_class}: {rate*100:.2f}%")
-    # Compare improvements
-    print("\n" + "="*60)
-    print("CNN IMPROVEMENTS OVER SVD+LR")
-    print("="*60)
-    improvements = []
-    for i in range(10):
-        for j in range(10):
-            if i != j:
-                improvement = cm_svd[i, j] - cm_cnn[i, j]
-                if improvement > 0.01:  # More than 1% improvement
-                    improvements.append((i, j, improvement))
-    improvements.sort(key=lambda x: x[2], reverse=True)
-    print("Top pairs where CNN reduced confusion:")
-    for i, (true_class, pred_class, improvement) in enumerate(improvements[:5], 1):
-        svd_rate = cm_svd[true_class, pred_class]
-        cnn_rate = cm_cnn[true_class, pred_class]
-        print(f"  {i}. {true_class} → {pred_class}: {svd_rate*100:.2f}% → {cnn_rate*100:.2f}% "
-              f"(Δ = -{improvement*100:.2f} pp)")
-def save_reports_to_csv(df_svd, df_cnn):
-    """Save detailed reports to CSV files."""
-    svd_path = os.path.join(config.RESULTS_DIR, 'per_class_metrics_svd.csv')
-    cnn_path = os.path.join(config.RESULTS_DIR, 'per_class_metrics_cnn.csv')
-    df_svd.to_csv(svd_path, index=False, float_format='%.4f')
-    df_cnn.to_csv(cnn_path, index=False, float_format='%.4f')
-    print(f"\n✓ Saved detailed metrics to:")
-    print(f"  - {svd_path}")
-    print(f"  - {cnn_path}")
-def main():
-    set_seeds(SEED)
-    print("="*60)
-    print("PER-CLASS PERFORMANCE METRICS ANALYSIS")
-    print("="*60)
-    # Load test data
-    print("\nLoading MNIST test data...")
-    X_test, y_test = load_mnist()
-    print(f"Loaded {len(X_test)} test samples")
-    # Evaluate models
-    try:
-        print("\n" + "-"*60)
-        y_pred_svd = evaluate_svd(X_test, y_test)
-        print("✓ SVD+LR evaluation complete")
-    except Exception as e:
-        print(f"⚠️  Could not evaluate SVD model: {e}")
-        return
-    try:
-        print("\n" + "-"*60)
-        y_pred_cnn = evaluate_cnn(X_test, y_test)
-        print("✓ CNN evaluation complete")
-    except Exception as e:
-        print(f"⚠️  Could not evaluate CNN model: {e}")
-        return
-    # Generate metrics tables
-    print("\n" + "="*60)
-    print("GENERATING METRICS TABLES")
-    print("="*60)
-    df_svd = create_metrics_table(y_test, y_pred_svd, 'SVD+LR')
-    df_cnn = create_metrics_table(y_test, y_pred_cnn, 'CNN')
-    # Display tables
-    print("\n" + "-"*60)
-    print("SVD+LR Per-Class Metrics")
-    print("-"*60)
-    print(df_svd.to_string(index=False, float_format='%.4f'))
-    print("\n" + "-"*60)
-    print("CNN Per-Class Metrics")
-    print("-"*60)
-    print(df_cnn.to_string(index=False, float_format='%.4f'))
-    # Identify hard pairs
-    identify_hard_pairs(y_test, y_pred_svd, y_pred_cnn)
-    # Generate visualizations
-    print("\n" + "="*60)
-    print("GENERATING VISUALIZATIONS")
-    print("="*60)
-    plot_metrics_comparison(df_svd, df_cnn, 'fig_19_per_class_metrics_comparison.png')
-    plot_metrics_heatmap(df_svd, df_cnn, 'fig_20_per_class_metrics_heatmap.png')
-    # Save to CSV
-    save_reports_to_csv(df_svd, df_cnn)
-    # Summary statistics
-    print("\n" + "="*60)
-    print("SUMMARY STATISTICS")
-    print("="*60)
-    svd_overall = df_svd.iloc[-1]
-    cnn_overall = df_cnn.iloc[-1]
-    print(f"\nSVD+LR Overall:")
-    print(f"  Precision: {svd_overall['Precision']:.4f}")
-    print(f"  Recall:    {svd_overall['Recall']:.4f}")
-    print(f"  F1-Score:  {svd_overall['F1-Score']:.4f}")
-    print(f"\nCNN Overall:")
-    print(f"  Precision: {cnn_overall['Precision']:.4f}")
-    print(f"  Recall:    {cnn_overall['Recall']:.4f}")
-    print(f"  F1-Score:  {cnn_overall['F1-Score']:.4f}")
-    print(f"\nCNN Improvement:")
-    print(f"  Precision: +{(cnn_overall['Precision'] - svd_overall['Precision'])*100:.2f} pp")
-    print(f"  Recall:    +{(cnn_overall['Recall'] - svd_overall['Recall'])*100:.2f} pp")
-    print(f"  F1-Score:  +{(cnn_overall['F1-Score'] - svd_overall['F1-Score'])*100:.2f} pp")
-    print("\n" + "="*60)
-    print("✓ Per-class metrics analysis complete!")
-    print("="*60)
-if __name__ == "__main__":
-    main()

experiments/appendix_learning_curves.py ADDED

	@@ -0,0 +1,26 @@

+"""
+Appendix A – Learning Curves
+Refactored to use centralized viz utilities.
+"""
+import pickle
+import os
+from src import config, viz
+def main():
+    experiments = [
+        ('cnn_10class_history.pkl', 'MNIST 10-class CNN Training', 'fig_14_learning_curves.png'),
+        ('cnn_fashion_history.pkl', 'Fashion-MNIST CNN Training', 'fig_15_learning_curves_fashion.png')
+    ]
+    for f_name, label, out_name in experiments:
+        path = os.path.join(config.MODELS_DIR, f_name)
+        if os.path.exists(path):
+            with open(path, 'rb') as f:
+                history = pickle.load(f)
+            viz.plot_learning_curves(history, label, out_name)
+        else:
+            print(f"Skipping {f_name}: Not found at {path}.")
+if __name__ == "__main__":
+    main()

experiments/appendix_per_class_metrics.py ADDED

	@@ -0,0 +1,53 @@

+"""
+Appendix B – Per-Class Performance Metrics (MNIST)
+Refactored to use centralized utility modules.
+"""
+import torch
+import numpy as np
+from src import utils, viz, exp_utils
+def main():
+    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+    print("Loading Models and Test Data...")
+    # Load Models (MNIST default)
+    svd_pipe, cnn = utils.load_models(dataset_name="mnist")
+    if svd_pipe is None or cnn is None:
+        return
+    X_test, y_test = utils.load_data_split(dataset_name="mnist", train=False)
+    X_test_flat = X_test.view(X_test.size(0), -1).numpy()
+    y_test_np = y_test.numpy()
+    # 1. Collect Predictions
+    print("Collecting Predictions...")
+    y_preds_dict = {}
+    # CNN Predictions
+    cnn.to(device)  # keep the model on the same device as the inputs below
+    cnn.eval()
+    with torch.no_grad():
+        y_preds_dict['CNN'] = cnn(X_test.to(device)).argmax(dim=1).cpu().numpy()
+    # SVD+LR Predictions
+    y_preds_dict['SVD+LR'] = svd_pipe.predict(X_test_flat)
+    # 2. Print Metrics Report
+    from sklearn.metrics import recall_score, precision_score, f1_score
+    for name, y_pred in y_preds_dict.items():
+        print(f"\n--- {name} Report (Average Metrics) ---")
+        p = precision_score(y_test_np, y_pred, average='macro')
+        r = recall_score(y_test_np, y_pred, average='macro')
+        f = f1_score(y_test_np, y_pred, average='macro')
+        print(f"Macro Average: Precision={p:.3f}, Recall={r:.3f}, F1={f:.3f}")
+    # 3. Visualization: Per-Class F1 Comparison
+    viz.plot_per_class_comparison(
+        y_test_np,
+        y_preds_dict,
+        'fig_19_per_class_metrics_comparison.png'
+    )
+    print("Appendix B Completed.")
+if __name__ == "__main__":
+    main()
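
Review note: the removed `13_per_class_metrics.py` reported support-weighted averages, while this script reports macro averages. The two can differ when classes have different support. The sketch below shows the difference with the standard library only; the helper names (`per_class_prf`, `macro_avg`, `weighted_avg`) and the toy labels are hypothetical and not part of this repository.

```python
def per_class_prf(y_true, y_pred, n_classes):
    """Return one (precision, recall, f1, support) tuple per class."""
    rows = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append((precision, recall, f1, tp + fn))
    return rows


def macro_avg(rows):
    """Unweighted mean of precision, recall and F1 across classes."""
    n = len(rows)
    return tuple(sum(r[i] for r in rows) / n for i in range(3))


def weighted_avg(rows):
    """Mean of precision, recall and F1 weighted by class support."""
    total = sum(r[3] for r in rows)
    return tuple(sum(r[i] * r[3] for r in rows) / total for i in range(3))


# Toy labels with unequal class sizes (hypothetical values)
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]

rows = per_class_prf(y_true, y_pred, n_classes=3)
macro_f1 = macro_avg(rows)[2]
weighted_f1 = weighted_avg(rows)[2]
print(f"macro F1 = {macro_f1:.4f}, weighted F1 = {weighted_f1:.4f}")
```

On the MNIST test set the class sizes are close to balanced, so the two averages are likely to be near each other, but the numbers printed here are not directly comparable with the weighted figures from the removed script.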

experiments/run_robustness_test.py ADDED

	@@ -0,0 +1,65 @@

+"""
+Unified Robustness Test Script
+Evaluates CNN, SVD, and Hybrid model performance under Gaussian noise.
+Refactored to use centralized src utilities.
+"""
+import argparse
+import torch
+import numpy as np
+from src import config, utils, viz, exp_utils
+from src.hybrid_model import HybridSVDCNN, SVDProjectionLayer
+def run_experiment(args):
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+    print(f"\n--- Running Robustness Test: {args.dataset.upper()} ---")
+    # 1. Load Data and Models
+    X_test, y_test = utils.load_data_split(dataset_name=args.dataset, train=False)
+    _, cnn = utils.load_models(dataset_name=args.dataset)
+    if cnn is None:
+        return
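+    # 2. Fit SVD Baseline (on the training split, to avoid test-set leakage) and Build Hybrid Model
+    print("Fitting SVD Baseline...")
+    X_train, y_train = utils.load_data_split(dataset_name=args.dataset, train=True)
+    X_train_flat = X_train.view(X_train.size(0), -1).numpy()
+    svd_pipe = exp_utils.fit_svd_baseline(X_train_flat, y_train.numpy(), n_components=20)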
+    svd = svd_pipe.named_steps['svd']
+    scaler = svd_pipe.named_steps['scaler']
+    # Hybrid model expects mean from scaler if available
+    svd_layer = SVDProjectionLayer(svd.components_, scaler.mean_)
+    hybrid = HybridSVDCNN(svd_layer, cnn).to(device)
+    # 3. Define Noise Levels
+    sigmas = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
+    results = {'CNN': [], 'SVD': [], 'Hybrid': []}
+    # 4. Evaluation Loop
+    for sigma in sigmas:
+        X_noisy = exp_utils.add_gaussian_noise(X_test, sigma)
+        results['CNN'].append(exp_utils.evaluate_classifier(cnn, X_noisy, y_test, device))
+        results['SVD'].append(exp_utils.evaluate_classifier(svd_pipe, X_noisy, y_test, is_pytorch=False))
+        results['Hybrid'].append(exp_utils.evaluate_classifier(hybrid, X_noisy, y_test, device))
+        print(f"σ={sigma:.1f} | CNN: {results['CNN'][-1]:.4f} | SVD: {results['SVD'][-1]:.4f} | Hybrid: {results['Hybrid'][-1]:.4f}")
+    # 5. Visualization
+    viz.plot_robustness_curves(
+        x_values=sigmas,
+        results_dict=results,
+        x_label='Gaussian Noise Level (σ)',
+        title=f'Robustness Analysis: {args.dataset.upper()}',
+        filename=f'fig_robustness_{args.dataset}.png'
+    )
+def main():
+    parser = argparse.ArgumentParser(description="Unified Robustness Evaluation")
+    parser.add_argument("--dataset", choices=["mnist", "fashion"], default="mnist", help="Dataset to evaluate.")
+    args = parser.parse_args()
+    run_experiment(args)
+if __name__ == "__main__":
+    main()

src/__init__.py ADDED

File without changes

src/config.py ADDED

	@@ -0,0 +1,17 @@

+import os
+# --- Paths ---
+BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+DATA_DIR = os.path.join(BASE_DIR, "data")
+MODELS_DIR = os.path.join(BASE_DIR, "models")
+RESULTS_DIR = os.path.join(BASE_DIR, "docs", "research_results")
+for d in [DATA_DIR, MODELS_DIR, RESULTS_DIR]:
+    os.makedirs(d, exist_ok=True)
+SVD_MODEL_PATH = os.path.join(MODELS_DIR, "svd_10class.pkl")
+CNN_MODEL_PATH = os.path.join(MODELS_DIR, "cnn_10class.pth")
+FASHION_SVD_PATH = os.path.join(MODELS_DIR, "svd_fashion.pkl")  # Placeholder; this file may not exist yet
+FASHION_CNN_PATH = os.path.join(MODELS_DIR, "cnn_fashion.pth")
+APP_CACHE_PATH = os.path.join(DATA_DIR, "app_cache.npz")

src/exp_utils.py ADDED

	@@ -0,0 +1,68 @@

+import torch
+import numpy as np
+import torchvision.transforms as transforms
+from sklearn.metrics import accuracy_score
+from sklearn.decomposition import TruncatedSVD
+from sklearn.linear_model import LogisticRegression
+from sklearn.pipeline import Pipeline
+from sklearn.preprocessing import StandardScaler
+def fit_svd_baseline(X_train, y_train, n_components=20):
+    """Fits a linear baseline (SVD + Logistic Regression) on the fly."""
+    pipeline = Pipeline([
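+        # Center only: SVDProjectionLayer reuses scaler.mean_ and svd.components_ without any per-pixel scaling
+        ('scaler', StandardScaler(with_std=False)),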
+        ('svd', TruncatedSVD(n_components=n_components, random_state=42)),
+        ('logistic', LogisticRegression(max_iter=1000))
+    ])
+    pipeline.fit(X_train, y_train)
+    return pipeline
+def add_gaussian_noise(X, sigma):
+    """
+    Adds zero-mean Gaussian noise (std = sigma) and clips the result to [0, 1].
+    Works for both torch Tensors and numpy arrays; returns the same type as the input.
+    """
+    if sigma <= 0:
+        return X
+    if torch.is_tensor(X):
+        noise = torch.randn_like(X) * sigma
+        return torch.clamp(X + noise, 0, 1)
+    else:
+        noise = np.random.randn(*X.shape) * sigma
+        return np.clip(X + noise, 0, 1)
+def add_blur(X, kernel_size):
+    """Gaussian blur for torch Tensors (4D: B, C, H, W). kernel_size must be odd; values <= 1 return X unchanged."""
+    if kernel_size <= 1:
+        return X
+    sigma = 0.1 + 0.3 * (kernel_size // 2)
+    blur_fn = transforms.GaussianBlur(kernel_size=(kernel_size, kernel_size), sigma=(sigma, sigma))
+    return blur_fn(X)
+def evaluate_classifier(model, X, y, device="cpu", is_pytorch=True):
+    """
+    Unified evaluation function.
+    Handles PyTorch models (CNN, Hybrid) and Sklearn pipelines (SVD+LR).
+    """
+    # Normalise labels to a NumPy array so accuracy_score always receives the same type
+    y = y.cpu().numpy() if torch.is_tensor(y) else np.asarray(y)
+    if is_pytorch:
+        model.eval()
+        model.to(device)
+        # Ensure X is 4D for CNN (B, 1, 28, 28)
+        if len(X.shape) == 2:
+            X_t = torch.as_tensor(X.reshape(-1, 1, 28, 28), dtype=torch.float32).to(device)
+        else:
+            X_t = torch.as_tensor(X, dtype=torch.float32).to(device)
+        with torch.no_grad():
+            logits = model(X_t)
+            preds = torch.argmax(logits, dim=1).cpu().numpy()
+        return accuracy_score(y, preds)
+    else:
+        # Sklearn pipeline - Ensure X is flattened 2D numpy
+        if torch.is_tensor(X):
+            X_np = X.view(X.size(0), -1).cpu().numpy()
+        else:
+            X_np = X.reshape(X.shape[0], -1)
+        preds = model.predict(X_np)
+        return accuracy_score(y, preds)
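
Review note: `add_gaussian_noise` draws from the global torch or NumPy generator, so each robustness run may see different noise unless the caller seeds those generators first. One possible way to make the perturbation repeatable is to pass an explicit seed. The sketch below shows the idea on a plain list of pixel values using only the standard library; `add_gaussian_noise_seeded` and its `seed` parameter are hypothetical and are not part of `src/exp_utils.py`.

```python
import random


def add_gaussian_noise_seeded(pixels, sigma, seed=0):
    """Add N(0, sigma^2) noise to each pixel and clip to [0, 1], reproducibly."""
    if sigma <= 0:
        return list(pixels)
    rng = random.Random(seed)  # local generator, leaves global random state untouched
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


# Hypothetical flattened image fragment
pixels = [0.0, 0.25, 0.5, 0.75, 1.0]

noisy_a = add_gaussian_noise_seeded(pixels, sigma=0.3, seed=42)
noisy_b = add_gaussian_noise_seeded(pixels, sigma=0.3, seed=42)
clean = add_gaussian_noise_seeded(pixels, sigma=0.0)
print(noisy_a == noisy_b)
```

In the tensor version the same effect could be had by passing a `torch.Generator` to the sampling call, at the cost of a slightly longer signature.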

src/hybrid_model.py ADDED

	@@ -0,0 +1,45 @@

+# Hybrid SVD-CNN Model
+import torch
+import torch.nn as nn
+import numpy as np
+from sklearn.decomposition import TruncatedSVD
+class SimpleCNN(nn.Module):
+    def __init__(self, num_classes=10):
+        super().__init__()
+        self.features = nn.Sequential(
+            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
+            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)
+        )
+        self.classifier = nn.Sequential(
+            nn.Linear(32 * 7 * 7, 128), nn.ReLU(),
+            nn.Linear(128, num_classes)
+        )
+    def forward(self, x):
+        return self.classifier(self.features(x).view(x.size(0), -1))
+class SVDProjectionLayer(nn.Module):
+    def __init__(self, V_k, mean=None):
+        super().__init__()
+        self.register_buffer('V_k', torch.tensor(V_k, dtype=torch.float32))
+        self.register_buffer('mean', torch.tensor(mean, dtype=torch.float32) if mean is not None else torch.zeros(V_k.shape[1]))
+    def forward(self, x):
+        b = x.size(0)
+        x_rec = (x.view(b, -1) - self.mean) @ self.V_k.T @ self.V_k + self.mean
+        return torch.clamp(x_rec, 0, 1).view(b, 1, 28, 28)
+class HybridSVDCNN(nn.Module):
+    def __init__(self, svd_layer, cnn):
+        super().__init__()
+        self.svd_layer, self.cnn = svd_layer, cnn
+    def forward(self, x):
+        return self.cnn(self.svd_layer(x))
+def create_svd_layer(X_train, n_components=20):
+    mean = np.mean(X_train, axis=0)
+    svd = TruncatedSVD(n_components=n_components, random_state=42).fit(X_train - mean)
+    return SVDProjectionLayer(svd.components_, mean)
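
`SVDProjectionLayer` reconstructs each image as `(x - mean) @ V_k.T @ V_k + mean`, an orthogonal projection onto the top-k right singular vectors. The NumPy sketch below reproduces that step without the final clamp, using `np.linalg.svd` in place of `TruncatedSVD` and random data in place of MNIST. Both substitutions, and the names `fit_projection` and `project`, are assumptions for illustration only.

```python
import numpy as np


def fit_projection(X, k):
    """Return the top-k right singular vectors (k x d) and the feature mean."""
    mean = X.mean(axis=0)
    # Rows of Vt are orthonormal directions sorted by singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return Vt[:k], mean


def project(X, V_k, mean):
    """Project centered data onto span(V_k) and add the mean back."""
    return (X - mean) @ V_k.T @ V_k + mean


rng = np.random.default_rng(0)
X = rng.random((200, 784))           # stand-in for flattened 28x28 images
V_k, mean = fit_projection(X, k=20)
X_rec = project(X, V_k, mean)

# Projecting a second time changes nothing: the operator is idempotent.
print(np.allclose(project(X_rec, V_k, mean), X_rec))
```

Because the rows of `V_k` are orthonormal, `V_k.T @ V_k` is a projection matrix, which is why applying it twice matches applying it once (before any clamping).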

src/train_fashion.py ADDED Viewed

	@@ -0,0 +1,27 @@

+import torch
+import torchvision
+from torchvision import transforms
+from src.hybrid_model import SimpleCNN
+from src import config
+from src.train_models import train_cnn, set_seed
+if __name__ == "__main__":
+    set_seed()
+    print("Loading Fashion-MNIST for training...")
+    transform = transforms.Compose([transforms.ToTensor()])
+    train_dataset = torchvision.datasets.FashionMNIST(root=config.DATA_DIR, train=True, download=True, transform=transform)
+    # Extract data to tensors for train_cnn
+    X = train_dataset.data.float() / 255.0
+    y = train_dataset.targets
+    # Temporarily override CNN_MODEL_PATH so train_cnn saves to the Fashion-MNIST path
+    original_path = config.CNN_MODEL_PATH
+    config.CNN_MODEL_PATH = config.CNN_FASHION_MODEL_PATH
+    print(f"Retraining Fashion-MNIST model to {config.CNN_FASHION_MODEL_PATH}...")
+    try:
+        train_cnn(X, y)
+    finally:
+        # Restore the original path even if training fails
+        config.CNN_MODEL_PATH = original_path
+    print("Fashion-MNIST training completed.")

src/train_models.py ADDED Viewed

	@@ -0,0 +1,72 @@

+import torch
+import torch.nn as nn
+import torch.optim as optim
+from torch.utils.data import TensorDataset, DataLoader
+from sklearn.decomposition import TruncatedSVD
+from sklearn.model_selection import train_test_split
+import pickle
+import os
+import numpy as np
+import random
+from src.hybrid_model import SimpleCNN
+from src.utils import load_data_split
+from src import config
+def set_seed(seed=42):
+    random.seed(seed); np.random.seed(seed)
+    torch.manual_seed(seed); torch.cuda.manual_seed_all(seed)
+    torch.backends.cudnn.deterministic = True
+def train_svd(X_flat, n_components=20):
+    print(f"Training SVD (k={n_components})...")
+    X_np = X_flat.numpy()
+    mean = X_np.mean(axis=0)
+    svd = TruncatedSVD(n_components=n_components, random_state=42).fit(X_np - mean)
+    svd._train_mean = mean
+    with open(config.SVD_MODEL_PATH, "wb") as f: pickle.dump(svd, f)
+    return svd
+def train_cnn(X_flat, y, batch_size=64, epochs=5):
+    X_train, X_val, y_train, y_val = train_test_split(X_flat.numpy(), y.numpy(), test_size=0.2, random_state=42, stratify=y.numpy())
+    def to_loader(X, y, shuffle=True):
+        return DataLoader(TensorDataset(torch.tensor(X).view(-1, 1, 28, 28), torch.tensor(y, dtype=torch.long)), batch_size=batch_size, shuffle=shuffle)
+    train_loader, val_loader = to_loader(X_train, y_train), to_loader(X_val, y_val, False)
+    model = SimpleCNN().to("cuda" if torch.cuda.is_available() else "cpu")
+    opt = optim.Adam(model.parameters(), lr=0.001)
+    crit = nn.CrossEntropyLoss()
+    history = {'train_loss': [], 'val_loss': [], 'train_acc': [], 'val_acc': []}
+    best_acc, best_state = 0, None
+    for epoch in range(epochs):
+        model.train()
+        t_loss, t_corr = 0, 0
+        for x, labels in train_loader:
+            x, labels = x.to(next(model.parameters()).device), labels.to(next(model.parameters()).device)
+            opt.zero_grad(); out = model(x); loss = crit(out, labels); loss.backward(); opt.step()
+            t_loss += loss.item(); t_corr += (out.argmax(1) == labels).sum().item()
+        model.eval(); v_loss, v_corr = 0, 0
+        with torch.no_grad():
+            for x, labels in val_loader:
+                x, labels = x.to(next(model.parameters()).device), labels.to(next(model.parameters()).device)
+                out = model(x); v_loss += crit(out, labels).item(); v_corr += (out.argmax(1) == labels).sum().item()
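+        # Record mean per-batch losses; plot_learning_curves reads these keys
+        history['train_loss'].append(t_loss / len(train_loader)); history['val_loss'].append(v_loss / len(val_loader))
+        history['train_acc'].append(100 * t_corr / len(X_train)); history['val_acc'].append(100 * v_corr / len(X_val))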
+        print(f"Epoch {epoch+1}: Train Acc {history['train_acc'][-1]:.2f}%, Val Acc {history['val_acc'][-1]:.2f}%")
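+        if history['val_acc'][-1] > best_acc:
+            # Clone the tensors: state_dict().copy() is shallow and would keep tracking the live weights
+            best_acc, best_state = history['val_acc'][-1], {k: v.detach().clone() for k, v in model.state_dict().items()}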
+    model.load_state_dict(best_state)
+    torch.save(model.cpu().state_dict(), config.CNN_MODEL_PATH)
+    with open(config.CNN_MODEL_PATH.replace('.pth', '_history.pkl'), 'wb') as f: pickle.dump(history, f)
+    return model, history
+if __name__ == "__main__":
+    set_seed()
+    X, y = load_data_split()
+    train_svd(X.view(-1, 784))
+    train_cnn(X.view(-1, 784), y)
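
`train_cnn` keeps a snapshot of the best-performing weights and reloads it after the last epoch. A snapshot only works if it owns its own storage: a shallow dictionary copy shares the underlying values with the live model, so later in-place updates overwrite it. The sketch below shows the difference with plain Python lists standing in for parameter tensors. It is an analogy under that assumption, not PyTorch code.

```python
import copy

# Plain lists stand in for parameter tensors in a state dict.
weights = {"fc.weight": [0.1, 0.2], "fc.bias": [0.0]}

shallow = weights.copy()        # new dict, but the same underlying lists
deep = copy.deepcopy(weights)   # new dict and new lists

# Simulate an in-place optimizer update on the live weights.
weights["fc.weight"][0] = 9.9

print(shallow["fc.weight"][0])  # follows the live weights
print(deep["fc.weight"][0])     # keeps the value from snapshot time
```

In PyTorch the same effect is usually avoided by cloning each tensor or by deep-copying the state dict before training continues.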

src/utils.py ADDED Viewed

	@@ -0,0 +1,116 @@

+import torch
+import torchvision.transforms as T
+import torchvision
+import numpy as np
+import os
+import pickle
+import ssl
+from src.hybrid_model import SimpleCNN
+from src import config
+import cv2
+def load_data_split(dataset_name="mnist", train=True, digits=None, flatten=False):
+    """
+    Unified entry point for data loading.
+    Supports: MNIST, Fashion-MNIST, and custom digit filtering (e.g., [3, 8]).
+    """
+    # Bypass SSL verification issues for dataset downloads.
+    # Note: this disables HTTPS certificate checks for the whole process.
+    ssl._create_default_https_context = ssl._create_unverified_context
+    transform = T.Compose([T.ToTensor()])
+    if dataset_name.lower() == "mnist":
+        dataset = torchvision.datasets.MNIST(config.DATA_DIR, train=train, download=True, transform=transform)
+    elif dataset_name.lower() == "fashion":
+        dataset = torchvision.datasets.FashionMNIST(config.DATA_DIR, train=train, download=True, transform=transform)
+    else:
+        raise ValueError(f"Unknown dataset: {dataset_name}")
+    X = dataset.data.float() / 255.0
+    y = dataset.targets
+    # Filter for specific digits if requested (e.g., [3, 8] for binary analysis)
+    if digits is not None:
+        mask = torch.zeros(len(y), dtype=torch.bool)
+        for d in digits:
+            mask |= (y == d)
+        X = X[mask]
+        y = y[mask]
+        # Remap labels to 0, 1... for binary tasks
+        if len(digits) == 2:
+            y = torch.where(y == digits[0], torch.tensor(0), torch.tensor(1))
+    # Add channel dimension if not flattened (B, 1, 28, 28)
+    if not flatten:
+        X = X.unsqueeze(1)
+    else:
+        X = X.view(X.size(0), -1)
+    return X, y
+def load_models(dataset_name="mnist"):
+    """
+    Loads pre-trained SVD transformer and CNN model for a specific dataset.
+    Returns (svd, cnn). Either can be None if the file is missing.
+    """
+    svd_path = config.SVD_MODEL_PATH if dataset_name == "mnist" else config.FASHION_SVD_PATH
+    cnn_path = config.CNN_MODEL_PATH if dataset_name == "mnist" else config.FASHION_CNN_PATH
+    svd, cnn = None, None
+    if os.path.exists(svd_path):
+        with open(svd_path, "rb") as f:
+            svd = pickle.load(f)
+    else:
+        print(f"Note: SVD model for {dataset_name} not found at {svd_path}")
+    if os.path.exists(cnn_path):
+        cnn = SimpleCNN()
+        cnn.load_state_dict(torch.load(cnn_path, map_location="cpu"))
+        cnn.eval()
+    else:
+        print(f"Note: CNN model for {dataset_name} not found at {cnn_path}")
+    return svd, cnn
+# --- Backward Compatibility Aliases ---
+load_data = load_data_split
+def preprocess_digit(img):
+    """
+    Original preprocessing logic used by the Streamlit app.
+    Crops, resizes (20x20), and pads to 28x28.
+    """
+    if isinstance(img, torch.Tensor):
+        img = img.numpy().astype(np.uint8)
+    # 1. Threshold & Find Bounding Box
+    _, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
+    coords = cv2.findNonZero(thresh)
+    if coords is None:
+        return torch.zeros((28, 28))
+    x, y, w, h = cv2.boundingRect(coords)
+    img_crop = img[y:y+h, x:x+w]
+    # 2. Resize to fit 20px
+    if w > h:
+        new_w = 20
+        new_h = int(h * (20 / w))
+    else:
+        new_h = 20
+        new_w = int(w * (20 / h))
+    if new_w == 0 or new_h == 0:
+        return torch.zeros((28, 28))
+    img_resize = cv2.resize(img_crop, (new_w, new_h), interpolation=cv2.INTER_AREA)
+    # 3. Center in 28x28
+    final_img = np.zeros((28, 28), dtype=np.uint8)
+    pad_y = (28 - new_h) // 2
+    pad_x = (28 - new_w) // 2
+    final_img[pad_y:pad_y+new_h, pad_x:pad_x+new_w] = img_resize
+    # 4. Normalize
+    return torch.tensor(final_img).float() / 255.0
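
The resize-and-center arithmetic in `preprocess_digit` can be checked without OpenCV. The sketch below isolates only the geometry: scale the longer side of the bounding box to 20 px, keep the aspect ratio, then compute the padding that centers the result on a 28x28 canvas. `fit_and_center` is a hypothetical helper introduced for illustration.

```python
def fit_and_center(w, h, box=20, canvas=28):
    """Return (new_w, new_h, pad_x, pad_y) for a w x h bounding box."""
    if w > h:
        # Wide glyph: width fills the box, height scales down proportionally.
        new_w, new_h = box, int(h * (box / w))
    else:
        # Tall or square glyph: height fills the box.
        new_h, new_w = box, int(w * (box / h))
    pad_x = (canvas - new_w) // 2
    pad_y = (canvas - new_h) // 2
    return new_w, new_h, pad_x, pad_y


print(fit_and_center(40, 20))  # wide bounding box
print(fit_and_center(10, 30))  # tall bounding box
```

Very thin strokes can round down to a zero-sized side, which is why the original function returns an empty image when `new_w` or `new_h` is 0.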

src/viz.py ADDED Viewed

	@@ -0,0 +1,195 @@

+import matplotlib.pyplot as plt
+import seaborn as sns
+import numpy as np
+import os
+from matplotlib.colors import LinearSegmentedColormap
+from src import config
+# --- Nord Palette Colors ---
+COLOR_SVD = "#5E81AC"    # Nord 10 (Blue)
+COLOR_CNN = "#BF616A"    # Nord 11 (Red)
+COLOR_HYBRID = "#A3BE8C" # Nord 14 (Green)
+COLOR_TEXT = "#2E3440"   # Nord 0 (Dark)
+COLOR_GRID = "#D8DEE9"   # Nord 4
+def setup_style():
+    """Standardize matplotlib plots."""
+    plt.rcParams['font.family'] = 'sans-serif'
+    plt.rcParams['axes.edgecolor'] = COLOR_GRID
+    plt.rcParams['grid.alpha'] = 0.3
+    plt.rcParams['axes.labelcolor'] = COLOR_TEXT
+def save_fig(filename, dpi=300):
+    """Save plot to results directory."""
+    path = os.path.join(config.RESULTS_DIR, filename)
+    plt.tight_layout()
+    plt.savefig(path, dpi=dpi)
+    plt.close()
+    print(f"Figure saved to {path}")
+def plot_robustness_curves(x_values, results_dict, x_label, title, filename):
+    """Standardized robustness curve plotter."""
+    setup_style()
+    plt.figure(figsize=(10, 6))
+    colors = {'CNN': COLOR_CNN, 'SVD': COLOR_SVD, 'Hybrid': COLOR_HYBRID}
+    for label, accs in results_dict.items():
+        plt.plot(x_values, accs, label=label, marker='o',
+                 color=colors.get(label, '#4C566A'), linewidth=2)
+    plt.title(title, fontsize=14, fontweight='bold', pad=15)
+    plt.xlabel(x_label, fontsize=12)
+    plt.ylabel('Accuracy', fontsize=12)
+    plt.legend(frameon=True, facecolor='white', framealpha=0.8)
+    plt.grid(True)
+    save_fig(filename)
+def plot_confusion_matrix(y_true, y_pred, labels, filename, title, color_end=COLOR_SVD):
+    """Normalized confusion matrix with Nord-consistent coloring."""
+    from sklearn.metrics import confusion_matrix
+    setup_style()
+    cm = confusion_matrix(y_true, y_pred, normalize='true')
+    plt.figure(figsize=(10, 8))
+    # Custom cmap from Light Gray to Nord Color
+    cmap = LinearSegmentedColormap.from_list("NordCustom", ["#ECEFF4", color_end])
+    sns.heatmap(cm, annot=True, fmt='.1%', cmap=cmap, xticklabels=labels, yticklabels=labels)
+    plt.title(title, fontsize=14, fontweight='bold', pad=15)
+    plt.xlabel('Predicted', fontsize=12)
+    plt.ylabel('True', fontsize=12)
+    save_fig(filename)
+def plot_singular_spectrum(singular_values, cumulative_variance, filename):
+    """Visualizes singular values and explained variance."""
+    setup_style()
+    fig, ax1 = plt.subplots(figsize=(10, 6))
+    n = len(singular_values)
+    ax1.semilogy(range(1, n+1), singular_values, color=COLOR_SVD, label='Singular Values', linewidth=2)
+    ax1.set_xlabel('Principal Component (k)', fontsize=12)
+    ax1.set_ylabel('Singular Value (Log)', color=COLOR_SVD, fontsize=12)
+    ax1.tick_params(axis='y', labelcolor=COLOR_SVD)
+    ax2 = ax1.twinx()
+    ax2.plot(range(1, n+1), cumulative_variance, color=COLOR_CNN, linestyle='--', label='Cum. Var', linewidth=2)
+    ax2.set_ylabel('Cumulative Explained Variance', color=COLOR_CNN, fontsize=12)
+    ax2.tick_params(axis='y', labelcolor=COLOR_CNN)
+    ax2.set_ylim(0, 1.05)
+    plt.title('Singular Value Spectrum & Explained Variance', fontsize=14, fontweight='bold', pad=15)
+    fig.legend(loc="upper right", bbox_to_anchor=(1,1), bbox_transform=ax1.transAxes)
+    save_fig(filename)
+def plot_interpolation_dynamics(alphas, probs_8, rec_errors, filename):
+    """Visualizes the CNN response vs SVD reconstruction error during interpolation."""
+    setup_style()
+    plt.figure(figsize=(10, 6))
+    plt.plot(alphas, probs_8, color=COLOR_CNN, label='CNN Prob(8) [Topology]', marker='o', linewidth=2)
+    plt.plot(alphas, rec_errors, color=COLOR_SVD, label='SVD Rec Error [Global Variance]', marker='s', linewidth=2)
+    plt.axvline(x=0.5, color='#4C566A', linestyle='--', alpha=0.5, label='Ambiguity Mid-point')
+    plt.title('Mechanistic Dynamics: Interpolation vs. SVD Error', fontsize=14, fontweight='bold', pad=15)
+    plt.xlabel('Alpha (0=Digit 3, 1=Digit 8)', fontsize=12)
+    plt.ylabel('Metric Value', fontsize=12)
+    plt.legend()
+    plt.grid(True)
+    save_fig(filename)
+def plot_manifold_comparison(X_svd, X_umap, y, acc_svd, acc_raw, filename):
+    """Side-by-side comparison of SVD (linear) vs UMAP (non-linear) projections."""
+    setup_style()
+    fig, axes = plt.subplots(1, 2, figsize=(15, 6))
+    colors = [COLOR_SVD, COLOR_CNN] # 3 vs 8
+    labels = ['Digit 3', 'Digit 8']
+    for i in range(2):
+        # SVD Plane
+        axes[0].scatter(X_svd[y==i, 0], X_svd[y==i, 1], label=labels[i], alpha=0.5, s=15, color=colors[i])
+        # UMAP Manifold
+        axes[1].scatter(X_umap[y==i, 0], X_umap[y==i, 1], label=labels[i], alpha=0.5, s=15, color=colors[i])
+    axes[0].set_title(f"SVD Projection (2D Subspace)\nk-NN Accuracy: {acc_svd:.2%}", fontsize=12)
+    axes[1].set_title(f"UMAP Manifold (Non-linear)\nRaw k-NN Accuracy: {acc_raw:.2%}", fontsize=12)
+    for ax in axes:
+        ax.legend()
+        ax.set_xticks([])
+        ax.set_yticks([])
+    plt.suptitle("Manifold Collapse: Linear SVD Overlap vs. Non-linear Topological Separation",
+                 fontsize=14, fontweight='bold', y=1.02)
+    save_fig(filename)
+def plot_learning_curves(history, title, filename):
+    """Standardized plotter for training history (loss and accuracy)."""
+    setup_style()
+    epochs = range(1, len(history['train_loss']) + 1)
+    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
+    # Nord palette for curves
+    COLOR_TRAIN = COLOR_SVD
+    COLOR_VAL = "#D08770" # Nord 12 (Orange)
+    # Loss Plot
+    ax1.plot(epochs, history['train_loss'], label='Train', color=COLOR_TRAIN, marker='o', markersize=4, linewidth=1.5)
+    ax1.plot(epochs, history['val_loss'], label='Val', color=COLOR_VAL, marker='s', markersize=4, linewidth=1.5)
+    ax1.set_title('Loss Dynamics', fontsize=12, fontweight='bold')
+    ax1.set_xlabel('Epoch')
+    ax1.set_ylabel('Loss')
+    ax1.legend()
+    ax1.grid(True)
+    # Accuracy Plot
+    ax2.plot(epochs, history['train_acc'], label='Train', color=COLOR_TRAIN, marker='o', markersize=4, linewidth=1.5)
+    ax2.plot(epochs, history['val_acc'], label='Val', color=COLOR_VAL, marker='s', markersize=4, linewidth=1.5)
+    ax2.set_title('Accuracy Dynamics', fontsize=12, fontweight='bold')
+    ax2.set_xlabel('Epoch')
+    ax2.set_ylabel('Accuracy')
+    ax2.legend()
+    ax2.grid(True)
+    plt.suptitle(title, fontsize=14, fontweight='bold', y=1.02)
+    save_fig(filename)
+def plot_per_class_comparison(y_test, y_preds_dict, filename):
+    """Grouped bar chart comparing F1-scores per class for multiple models."""
+    from sklearn.metrics import f1_score
+    setup_style()
+    plt.figure(figsize=(10, 6))
+    x = np.arange(10)
+    width = 0.8 / len(y_preds_dict)
+    colors = {
+        'SVD+LR': COLOR_SVD,
+        'CNN': COLOR_CNN,
+        'Hybrid': COLOR_HYBRID
+    }
+    for i, (label, y_pred) in enumerate(y_preds_dict.items()):
+        f1s = f1_score(y_test, y_pred, average=None)
+        plt.bar(x + (i - len(y_preds_dict)/2 + 0.5) * width, f1s, width,
+                label=label, color=colors.get(label, '#4C566A'), alpha=0.8)
+    plt.xticks(x)
+    plt.xlabel('Digit Class', fontsize=12)
+    plt.ylabel('F1-Score', fontsize=12)
+    plt.title('Per-Class Performance Comparison (F1-Score)', fontsize=14, fontweight='bold', pad=15)
+    plt.legend()
+    plt.grid(True, axis='y')
+    save_fig(filename)
+def plot_multi_image_grid(images, titles, rows, cols, filename, suptitle=None):
+    """Generic grid plotter for images (e.g., eigen-digits)."""
+    plt.figure(figsize=(cols * 2.5, rows * 2.5))
+    for i, (img, title) in enumerate(zip(images, titles)):
+        plt.subplot(rows, cols, i + 1)
+        plt.imshow(img, cmap='gray')
+        plt.title(title, fontsize=10)
+        plt.axis('off')
+    if suptitle:
+        plt.suptitle(suptitle, fontsize=14, fontweight='bold')
+    save_fig(filename)
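
`plot_per_class_comparison` places the bars for each model side by side using the offset `(i - n/2 + 0.5) * width`. The sketch below computes those offsets on their own to show that they are symmetric around each class tick. `bar_offsets` is a hypothetical helper, and the 0.8 total group width mirrors the value used in the plotting function.

```python
def bar_offsets(n_models, total_width=0.8):
    """Return the per-model x offsets and the width of a single bar."""
    width = total_width / n_models
    offsets = [(i - n_models / 2 + 0.5) * width for i in range(n_models)]
    return offsets, width


offsets, width = bar_offsets(3)
for i, off in enumerate(offsets):
    print(f"model {i}: offset {off:+.3f}, bar width {width:.3f}")
```

With three models the middle bar sits on the tick itself and the outer two mirror each other, so each group stays centered regardless of how many models are compared.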