Inference and Hub UX: shared predict_topk, atomic checkpoints, upload .env
- Add _ThreeHeadPredictMixin (predict_topk / predict_topk_from_path) on both
ResNet heads; Gradio uses it with predict_format top-k JSON and config
ImageNet normalize; device selection includes MPS.
- Add spot_check_excluded_post_impressionism.py (train transforms, --top-k)
and README spot-check table with GAP/Lin+He refs and BiLSTM wording.
- train_cnn: atomic torch.save via temp+replace; weights_only=False on loads;
clear error when last.pt is corrupt (suggest cp from best.pt).
- upload_model_to_hf: load .env with override=True; required --checkpoint;
default export to data/label_maps; drop --token; python-dotenv in
requirements; Dockerfile omits redundant --export-labels-dir.
- Tests for predict_topk, spot_check helpers, atomic save, dotenv override.
Made-with: Cursor
- Dockerfile +1 -1
- README.md +50 -2
- gradio/app.py +19 -14
- requirements.txt +1 -0
- scripts/spot_check_excluded_post_impressionism.py +137 -0
- scripts/train_cnn.py +25 -5
- scripts/upload_model_to_hf.py +41 -16
- src/model.py +64 -2
- src/predict_format.py +9 -0
- tests/test_model_architectures.py +48 -0
- tests/test_predict_format.py +12 -0
- tests/test_spot_check_excluded_post_impressionism.py +47 -0
- tests/test_train_cnn_atomic_save.py +33 -0
- tests/test_upload_model_to_hf.py +19 -0
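The atomic-checkpoint change above can be sketched without PyTorch; `atomic_save` here is a hypothetical stand-in that pickles instead of `torch.save`, but the mechanism (write a sibling temp file, then `os.replace` it into place) is the same one the commit adds to `train_cnn.py`:

```python
import os
import pickle
from pathlib import Path


def atomic_save(obj: object, path: Path) -> None:
    """Serialize to a sibling temp file, then os.replace() it into place.

    os.replace() swaps the directory entry atomically, so a process killed
    mid-write leaves either the previous checkpoint or the new one on disk,
    never a truncated file (the corrupt-last.pt failure mode this commit
    guards against).
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(path.suffix + ".tmp")
    with open(tmp, "wb") as f:
        pickle.dump(obj, f)
    os.replace(tmp, path)
```

A crash between `pickle.dump` and `os.replace` leaves only the `.tmp` file behind; the real checkpoint path is never half-written.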
Dockerfile
@@ -39,7 +39,7 @@ else \
     echo \"[train] No last.pt found; training from scratch for ${EPOCHS} epochs\"; \
     python scripts/train_cnn_safe.py --arch \"$ARCH\" --epochs \"$EPOCHS\" --batch-size-primary \"$BATCH_SIZE_PRIMARY\" --batch-size-fallback \"$BATCH_SIZE_FALLBACK\"; \
   fi; \
-  python scripts/upload_model_to_hf.py --repo-id \"$MODEL_REPO_ID\" --checkpoint \"checkpoints/$ARCH/best.pt\"
+  python scripts/upload_model_to_hf.py --repo-id \"$MODEL_REPO_ID\" --checkpoint \"checkpoints/$ARCH/best.pt\"; \
   kill $SERVER_PID >/dev/null 2>&1 || true \
 "]
 
README.md
@@ -12,14 +12,62 @@ pinned: false
 
 # Arty
 
-WikiArt **genre
+**Arty** is a multi-task WikiArt classifier: **genre**, **style**, and **artist** in one model with two architectures — a **CNN baseline** (ResNet-50 + global pooling + three heads) and a **CNN–RNN** (same backbone, **bidirectional long short-term memory (BiLSTM)** over spatial features + three heads). This Hugging Face **Space** runs the **Gradio** app in [`gradio/app.py`](gradio/app.py); **weights** load from Hub model repos and **architecture** from [`src/model.py`](src/model.py). Each architecture corresponds to a model: [pdjota/cnn-baseline](https://huggingface.co/pdjota/cnn-baseline) or [pdjota/arty-cnn-rnn](https://huggingface.co/pdjota/arty-cnn-rnn).
 
 **Training** (Docker GPU job) lives in [`Dockerfile`](Dockerfile) — use a **separate** Space pointed at the same repo if you only want training, or run locally. Do not set `sdk: docker` on this Space if you want the Gradio UI.
 
+## About this project
+
+### Classification with three labels together
+
+**Genre** and **style** are relatively **generic**: many paintings share the same movement or subject category, and the model learns broad visual patterns that match those labels. **Artist** is **more specific** — we usually think of *who* painted it **within** a stylistic movement. Distinguishing painters who share a movement, like Sisley and Monet within Impressionism, or identifying Picasso across his distinct Symbolist and Cubist bodies of work, becomes challenging. The network must pick up **fine-grained** cues (palette, brushwork, recurring motifs) that sit on top of the same broad visual cues that support style and genre.
+
+We use a **shared convolutional trunk** (one set of image features) and **three separate heads** (genre, style, artist). The trunk carries **generic** painting features; the heads split **coarse** (genre, style) vs **fine** (artist) decisions so the artist task can specialize without forcing a single output to encode everything at once.
+
+### ResNet-50 and short fine-tuning
+
+**ResNet-50** is a deep convolutional network built from **residual (skip) connections** so very deep stacks train without vanishing gradients ([He et al., 2016](https://arxiv.org/abs/1512.03385); trained on **ImageNet** in that work). **Transfer learning** is standard: features from early conv layers tend to transfer across related visual domains better than random initialization ([Yosinski et al., 2014](https://arxiv.org/abs/1411.1792)). For **paintings**, [Zhao et al. (2021)](https://doi.org/10.1371/journal.pone.0248414) compare the same model families **with and without** ImageNet-based transfer on WikiArt (genre / style / artist) and report strong results with pretraining — so we **fine-tune** the backbone and heads for a **limited number of epochs** instead of training from scratch at ImageNet-scale cost.
+We use an **ArtGAN-aligned** WikiArt-style index: images catalogued with consistent **genre, style, and
+artist** labels; a few broken paths are excluded. A curated dataset is on the Hub (e.g. [`pdjota/artyset`]
+(https://huggingface.co/datasets/pdjota/artyset)) for reproducible training.
+
+The resulting classification is already good for some examples. **Spot check (CNN baseline `best.pt` with [`scripts/spot_check_excluded_post_impressionism.py`](scripts/spot_check_excluded_post_impressionism.py)):** five **Post_Impressionism** images under `data/wikiart_excluded/Post_Impressionism/` (not in the training index) — style top-1 and a short note:
+
+| Painting | Style (top-1) | Comment |
+| -------- | ------------- | ------- |
+| `henri-matisse_a-vase-with-oranges.jpg` | Post_Impressionism (~80%) | Still life; confident style match. |
+| `henri-de-toulouse-lautrec_portrait-of-vincent-van-gogh-1887.jpg` | Impressionism (~99%) | Sketchy handling reads as Impressionist; Post_Impressionism far behind. |
+| `pablo-picasso_seated-monkey-1905.jpg` | Post_Impressionism (~42%) | Close with Expressionism; **artist** top-1 Picasso (~93%). |
+| `paul-gauguin_a-seashore-1887.jpg` | Impressionism (~75%) | Post_Impressionism second (~16%). |
+| `a.y.-jackson_the-edge-of-the-maple-wood-1910.jpg` | Impressionism (~94%) | Landscape; artist head has no A.Y. Jackson class (23 ArtGAN artists). |
+
+### Bidirectional long short-term memory (BiLSTM) on top of the CNN
+
+Zhao et al. note that their setup uses **colour** information heavily and that **spatial** information could still improve classification. Standard **global average pooling (GAP)** after the last conv map **throws away layout**: it averages each channel over all spatial positions, so the classifier sees a **single vector per channel** with **no remaining (x, y) structure** ([Lin et al., 2014](https://arxiv.org/abs/1312.4400); ResNet-50 uses this pattern before its **fully connected (FC)** layer, [He et al., 2016](https://arxiv.org/abs/1512.03385)). That answers “what is present” but not “how it is arranged.” We keep the same ResNet backbone, then **turn the spatial grid into a sequence** (e.g. column-wise strips), run a **bidirectional long short-term memory (BiLSTM)**, then classify. The **reasoning** is: composition, figure–ground balance, and brushstroke patterns often have **left–right (or strip-wise) structure**; a sequence model can integrate **context** along that axis **bidirectionally**, which GAP does not model. The CNN–RNN is a **minimal, comparable** upgrade: same heads and training loop, different pooling.
+
+### Data: ArtGAN-aligned index and Hugging Face
+
+We align with the **ArtGAN / WikiArt** lineage so labels are **catalogue-consistent** for genre, style, and artist. We **trim** the index (drop broken/missing files, validate paths) and publish a curated dataset on the Hub (e.g. [`pdjota/artyset`](https://huggingface.co/datasets/pdjota/artyset)) so **training and demos are reproducible**.
+
+`scripts/train_cnn.py` uses **70% / 15% / 15%** train / val / test, **stratified by `artist_id`**. **Reasoning:** if we split randomly by image, we might put **almost all works of a rare artist** in one fold; **artist** is also the label that would most easily “leak” structurally (same brushwork in train vs test). Stratifying by artist keeps **each split’s artist mix** more representative, so validation/test **accuracy and loss** are comparable across runs and less dominated by **which artists** landed in which fold.
+
+### Training artifacts and this Space
+
+Runs save **PyTorch** checkpoints (`best.pt`, `last.pt`), **CSV** logs (`train_log.csv`, `results_summary.csv`), and we upload reference models with **`id2label` JSON** to Hub model repos. Training works on **CPU**, **Apple Silicon (MPS)**, or a **GPU Space**.
+
+**Gradio (this Space):** upload a painting and compare **CNN baseline** vs **CNN–RNN** top-k predictions. Env vars below select which Hub checkpoints to load.
+
 ## Env (optional)
 
 - `BASELINE_MODEL_REPO_ID` — default `pdjota/cnn-baseline`
 - `CNNRNN_MODEL_REPO_ID` — default `pdjota/arty-cnn-rnn`
 - `HF_TOKEN` — if model repos are gated
 
-More detail: [`gradio/README.md`](gradio/README.md), [`docs/monorepo_gradio_space.md`](docs/monorepo_gradio_space.md).
+More detail: [`gradio/README.md`](gradio/README.md), [`docs/monorepo_gradio_space.md`](docs/monorepo_gradio_space.md), research plan [`plan.md`](plan.md).
+
+### References (ResNet / transfer / WikiArt)
+
+1. He, K., Zhang, X., Ren, S., & Sun, J. (2016). *Deep residual learning for image recognition.* CVPR. [arXiv:1512.03385](https://arxiv.org/abs/1512.03385)
+2. Yosinski, J., Clune, J., Bengio, Y., & Lipson, H. (2014). *How transferable are features in deep neural networks?* NeurIPS. [arXiv:1411.1792](https://arxiv.org/abs/1411.1792)
+3. Zhao, W., Zhou, D., Qiu, X., & Jiang, W. (2021). *Compare the performance of the models in art classification.* PLOS ONE 16(3): e0248414. [DOI:10.1371/journal.pone.0248414](https://doi.org/10.1371/journal.pone.0248414)
+4. Lin, M., Chen, Q., & Yan, S. (2014). *Network in network.* ICLR. [arXiv:1312.4400](https://arxiv.org/abs/1312.4400) — global average pooling to aggregate conv feature maps before classification.
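The GAP-versus-sequence argument in the README change above can be made concrete with a shape-level sketch. This is pure Python on nested lists (the real model does the equivalent on a ResNet feature-map tensor; `gap` and `column_pool` are hypothetical names): GAP collapses a `(C, H, W)` map to one `C`-vector, while column pooling keeps a length-`W` sequence that a BiLSTM can read left to right.

```python
def gap(fmap):
    """Global average pooling: (C, H, W) -> (C,).

    Every spatial position is averaged away, so left-right layout is lost;
    the classifier only sees one number per channel.
    """
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in fmap
    ]


def column_pool(fmap):
    """Column-wise pooling: (C, H, W) -> W vectors of size C.

    Each column of the feature map becomes one sequence step, preserving
    left-right order for a (Bi)LSTM to consume.
    """
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [sum(fmap[c][y][x] for y in range(H)) / H for c in range(C)]
        for x in range(W)
    ]
```

`gap` returns a single vector with no positional index left, while `column_pool` returns `W` steps, which is exactly the structure a sequence model needs.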
gradio/app.py
@@ -18,7 +18,6 @@ from typing import Any, Dict, List, Optional, Tuple
 
 import gradio as gr
 import torch
-import torch.nn.functional as F
 from huggingface_hub import hf_hub_download
 from PIL import Image
 from torchvision import transforms as T
@@ -32,12 +31,19 @@ if not _SRC.exists():
     )
 sys.path.insert(0, str(REPO_ROOT / "src"))
 
+from config import IMAGENET_MEAN, IMAGENET_STD  # type: ignore
 from model import ResNet50BiLSTMThreeHeads  # type: ignore
 from model import ResNet50ThreeHeads  # type: ignore
+from predict_format import topk_tuples_to_ui_items  # type: ignore
 
 BASELINE_REPO = os.environ.get("BASELINE_MODEL_REPO_ID", "pdjota/cnn-baseline")
 CNNRNN_REPO = os.environ.get("CNNRNN_MODEL_REPO_ID", "pdjota/arty-cnn-rnn")
-
+if torch.cuda.is_available():
+    DEVICE = torch.device("cuda")
+elif getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
+    DEVICE = torch.device("mps")
+else:
+    DEVICE = torch.device("cpu")
 HF_TOKEN = os.environ.get("HF_TOKEN")
 
 transform = T.Compose(
@@ -45,7 +51,7 @@ transform = T.Compose(
     T.Resize(256),
     T.CenterCrop(224),
     T.ToTensor(),
-    T.Normalize(mean=
+    T.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
     ]
 )
 
@@ -109,11 +115,6 @@ def _load(repo_id: str) -> Dict[str, Any]:
 
 # --- prediction helpers ----------------------------------------------------
 
-def _topk(logits: torch.Tensor, id2label: Dict[int, str], k: int = 3) -> List[Dict[str, Any]]:
-    probs = F.softmax(logits, dim=-1)[0]
-    vals, idxs = probs.topk(k)
-    return [{"label": id2label.get(int(i), str(int(i))), "prob": round(float(v), 4)} for v, i in zip(vals, idxs)]
-
 
 def _bucket(pct: float) -> str:
     if pct >= 80:
@@ -151,12 +152,16 @@ def predict(model_choice: str, image: Optional[Image.Image]) -> Tuple[str, str]:
     model = assets["model"]
 
     x = transform(image).unsqueeze(0).to(DEVICE)
-
-
-
-
-
-
+    g_t, s_t, a_t = model.predict_topk(
+        x,
+        genre_id2label=assets["genre"],
+        style_id2label=assets["style"],
+        artist_id2label=assets["artist"],
+        k=3,
+    )
+    g3 = topk_tuples_to_ui_items(g_t)
+    s3 = topk_tuples_to_ui_items(s_t)
+    a3 = topk_tuples_to_ui_items(a_t)
 
     summary = "\n".join([
         f"**Genre**: {_summarize(g3)}",
requirements.txt
@@ -9,6 +9,7 @@ scikit-learn>=1.2
 matplotlib>=3.7
 tqdm>=4.65
 huggingface_hub>=0.25.0,<1.0  # Gradio 5.x needs HfFolder; removed in hub 1.0
+python-dotenv>=1.0  # optional: HF_TOKEN from repo .env for upload scripts
 pytest>=7.0  # tests
 pytest-cov>=4.0  # coverage
 
scripts/spot_check_excluded_post_impressionism.py
@@ -0,0 +1,137 @@
+"""
+Spot-check the CNN baseline on fixed excluded Post_Impressionism images (README table).
+
+Images live under data/wikiart_excluded/Post_Impressionism/ (not in the training index).
+Uses the same eval transforms as training (`train_cnn.get_transforms(train=False)`).
+
+Usage (from repo root):
+
+    python scripts/spot_check_excluded_post_impressionism.py
+    python scripts/spot_check_excluded_post_impressionism.py --cpu
+    python scripts/spot_check_excluded_post_impressionism.py --top-k 5
+"""
+from __future__ import annotations
+
+import argparse
+import importlib.util
+import json
+import sys
+from pathlib import Path
+
+import torch
+
+ROOT = Path(__file__).resolve().parent.parent
+sys.path.insert(0, str(ROOT / "src"))
+
+from config import checkpoint_dir_for_arch
+from model import ResNet50ThreeHeads
+
+DEFAULT_REL_PATHS: tuple[str, ...] = (
+    "henri-matisse_a-vase-with-oranges.jpg",
+    "henri-de-toulouse-lautrec_portrait-of-vincent-van-gogh-1887.jpg",
+    "pablo-picasso_seated-monkey-1905.jpg",
+    "paul-gauguin_a-seashore-1887.jpg",
+    "a.y.-jackson_the-edge-of-the-maple-wood-1910.jpg",
+)
+
+LABEL_MAPS_DIR = ROOT / "data" / "label_maps"
+EXCLUDED_STYLE_DIR = ROOT / "data" / "wikiart_excluded" / "Post_Impressionism"
+
+
+def _load_train_cnn():
+    spec = importlib.util.spec_from_file_location("train_cnn", ROOT / "scripts" / "train_cnn.py")
+    mod = importlib.util.module_from_spec(spec)
+    assert spec.loader is not None
+    spec.loader.exec_module(mod)
+    return mod
+
+
+def load_id2label(path: Path) -> dict[int, str]:
+    with open(path, encoding="utf-8") as f:
+        return {int(k): v for k, v in json.load(f).items()}
+
+
+def load_label_maps() -> tuple[dict[int, str], dict[int, str], dict[int, str]]:
+    return (
+        load_id2label(LABEL_MAPS_DIR / "genre_id2label.json"),
+        load_id2label(LABEL_MAPS_DIR / "style_id2label.json"),
+        load_id2label(LABEL_MAPS_DIR / "artist_id2label.json"),
+    )
+
+
+def resolve_device(*, force_cpu: bool) -> torch.device:
+    if force_cpu:
+        return torch.device("cpu")
+    if torch.cuda.is_available():
+        return torch.device("cuda")
+    if getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
+        return torch.device("mps")
+    return torch.device("cpu")
+
+
+def main() -> None:
+    p = argparse.ArgumentParser(description="CNN spot-check on excluded Post_Impressionism examples.")
+    p.add_argument("--cpu", action="store_true", help="Force CPU")
+    p.add_argument(
+        "--checkpoint",
+        type=Path,
+        default=None,
+        help="Path to best.pt (default: checkpoints/<cnn>/best.pt from config)",
+    )
+    p.add_argument("--top-k", type=int, default=3, metavar="K", help="Top-k per head (default: 3)")
+    args = p.parse_args()
+
+    if args.top_k < 1:
+        print("ERROR: --top-k must be >= 1", file=sys.stderr)
+        sys.exit(1)
+
+    device = resolve_device(force_cpu=args.cpu)
+    ckpt_path = args.checkpoint if args.checkpoint is not None else checkpoint_dir_for_arch("cnn") / "best.pt"
+    if not ckpt_path.exists():
+        print(f"ERROR: checkpoint not found: {ckpt_path}", file=sys.stderr)
+        sys.exit(1)
+
+    genre_map, style_map, artist_map = load_label_maps()
+
+    ckpt = torch.load(ckpt_path, map_location=device, weights_only=False)
+    n_genre = ckpt["n_genre"]
+    n_style = ckpt["n_style"]
+    n_artist = ckpt["n_artist"]
+    model = ResNet50ThreeHeads(n_genre=n_genre, n_style=n_style, n_artist=n_artist, weights=None)
+    model.load_state_dict(ckpt["model_state_dict"])
+    model.to(device)
+
+    train_cnn = _load_train_cnn()
+    transform = train_cnn.get_transforms(train=False)
+
+    paths = [EXCLUDED_STYLE_DIR / name for name in DEFAULT_REL_PATHS]
+
+    print(f"Checkpoint: {ckpt_path}")
+    print(f"Device: {device}")
+    print(f"Top-k: {args.top_k}")
+    post_ids = [k for k, v in style_map.items() if v == "Post_Impressionism"]
+    print(f"Post_Impressionism style id(s): {post_ids}")
+    print()
+
+    for path in paths:
+        if not path.exists():
+            print(f"MISSING: {path}")
+            continue
+        g, s, a = model.predict_topk_from_path(
+            path,
+            transform,
+            device,
+            genre_id2label=genre_map,
+            style_id2label=style_map,
+            artist_id2label=artist_map,
+            k=args.top_k,
+        )
+        print("=" * 72)
+        print(path.name)
+        print("  genre (top-%d):" % args.top_k, g)
+        print("  style (top-%d):" % args.top_k, s)
+        print("  artist (top-%d):" % args.top_k, a)
+
+
+if __name__ == "__main__":
+    main()
scripts/train_cnn.py
@@ -26,6 +26,15 @@ from torchvision import transforms as T
 from sklearn.model_selection import train_test_split
 
 ROOT = Path(__file__).resolve().parent.parent
+
+
+def _atomic_torch_save(obj: object, path: Path) -> None:
+    """Write `path` via a temp file + `os.replace` so a kill mid-write does not truncate `last.pt` / `best.pt`."""
+    path = Path(path)
+    path.parent.mkdir(parents=True, exist_ok=True)
+    tmp = path.with_suffix(path.suffix + ".tmp")
+    torch.save(obj, tmp)
+    os.replace(tmp, path)
 sys.path.insert(0, str(ROOT / "src"))
 
 from config import (
@@ -134,7 +143,18 @@ def main() -> None:
     resume_path = ckpt_dir / "last.pt"
 
     if args.resume and resume_path.exists():
-
+        try:
+            ckpt = torch.load(resume_path, map_location=device, weights_only=False)
+        except Exception as e:
+            print(
+                f"[{now_ts()}] ERROR: Cannot load {resume_path} (often a truncated file if the process was killed "
+                f"during `torch.save`).\n"
+                f"  {e}\n"
+                f"  Fix: if best.pt is intact, copy it over last.pt and resume, e.g.\n"
+                f"    cp {ckpt_dir / 'best.pt'} {resume_path}",
+                file=sys.stderr,
+            )
+            sys.exit(1)
         start_epoch = ckpt["epoch"] + 1
         best_val_loss = ckpt.get("val_loss", float("inf"))
         extra = args.epochs if args.epochs is not None else 10
@@ -275,9 +295,9 @@ def main() -> None:
             "n_style": N_STYLE,
             "n_artist": N_ARTIST,
         }
-
+        _atomic_torch_save(ckpt, ckpt_dir / "last.pt")
         if is_best:
-
+            _atomic_torch_save(ckpt, ckpt_dir / "best.pt")
 
         log_row = {
             "epoch": epoch,
@@ -315,7 +335,7 @@ def main() -> None:
             "batch_in_epoch": current_batch_in_epoch,
             "num_batches_in_epoch": current_num_batches_in_epoch,
         }
-
+        _atomic_torch_save(interrupted_ckpt, ckpt_dir / "last.pt")
         print(
             "\n"
             f"[{now_ts()}] Stopped by user (Ctrl+C). Saved resumable checkpoint to "
@@ -327,7 +347,7 @@ def main() -> None:
     # Save best-val results summary
     best_ckpt_path = ckpt_dir / "best.pt"
     if best_ckpt_path.exists():
-        best_ckpt = torch.load(best_ckpt_path, map_location="cpu")
+        best_ckpt = torch.load(best_ckpt_path, map_location="cpu", weights_only=False)
         summary = {
             "best_epoch": best_ckpt.get("epoch"),
             "val_loss": best_ckpt.get("val_loss"),
scripts/upload_model_to_hf.py
@@ -1,28 +1,41 @@
 """
-Upload a trained checkpoint to
-export id→label JSON files locally.
+Upload a trained checkpoint and id→label JSONs to a Hugging Face model repo (for Spaces / demos).
 
-
+Usage (repo root):
 
-
-
-
+    python scripts/upload_model_to_hf.py --repo-id USER/reponame --checkpoint PATH/TO/best.pt
+
+`HF_TOKEN`: repo `.env` (python-dotenv) wins over an existing shell `HF_TOKEN` when the key appears in `.env`
+(`load_dotenv(..., override=True)`). Use a token with **write** access to the model repo. Local labels: `data/label_maps/`.
 """
 
 import argparse
 import json
+import os
 import sys
 from pathlib import Path
 
 import torch
 
 ROOT = Path(__file__).resolve().parent.parent
-
-
+
+
+def _load_dotenv_from_repo() -> None:
+    """Load repo `.env` into os.environ. Keys in `.env` override the same keys already in the environment
+    (fixes stale HF_TOKEN from the shell or IDE masking a valid token in `.env`)."""
+    env_path = ROOT / ".env"
+    if not env_path.is_file():
+        return
+    try:
+        from dotenv import load_dotenv
+    except ImportError:
+        return
+    load_dotenv(env_path, override=True)
+
 
 DATA_DIR = ROOT / "data"
 INDEX_SELECTED = DATA_DIR / "wikiart_index_selected.csv"
-
+LABEL_EXPORT_DEFAULT = DATA_DIR / "label_maps"
 
 
 def build_id2label_from_selected_index(index_path: Path) -> tuple[dict[str, str], dict[str, str], dict[str, str]]:
@@ -101,22 +114,34 @@ def upload_checkpoint_and_labels(
 
 
 def main() -> None:
-
+    _load_dotenv_from_repo()
+
+    p = argparse.ArgumentParser(
+        description="Upload model checkpoint + id2label JSONs to Hugging Face Hub. "
+        "Loads repo-root .env by default (HF_TOKEN) when python-dotenv is installed."
+    )
     p.add_argument("--repo-id", required=True, help="Model repo id, e.g. username/arty-cnn-baseline")
     p.add_argument(
         "--checkpoint",
         type=Path,
-
-        help=
+        required=True,
+        help="Checkpoint file to upload (e.g. checkpoints/cnn_baseline/best.pt or checkpoints/cnnrnn/best.pt)",
     )
-    p.add_argument("--token", default=None, help="HF token (default: HF_TOKEN env)")
     p.add_argument("--index", type=Path, default=INDEX_SELECTED, help="Selected index CSV (default: data/wikiart_index_selected.csv)")
-    p.add_argument(
+    p.add_argument(
+        "--export-labels-dir",
+        type=Path,
+        default=LABEL_EXPORT_DEFAULT,
+        help=f"Write *_id2label.json here (default: {LABEL_EXPORT_DEFAULT})",
+    )
     args = p.parse_args()
 
-    token =
+    token = os.environ.get("HF_TOKEN")
     if not token:
-        print(
+        print(
+            "Missing token: add HF_TOKEN to repo-root .env",
+            file=sys.stderr,
+        )
         sys.exit(1)
 
     # quick sanity load of checkpoint format
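The `override=True` precedence that `_load_dotenv_from_repo` relies on can be illustrated without python-dotenv. `load_env_file` below is a hypothetical minimal parser, shown only to demonstrate the rule: with override off, an existing environment key (e.g. a stale shell `HF_TOKEN`) shadows the file; with override on, the file wins.

```python
def load_env_file(path, environ, override=False):
    """Minimal .env loader mirroring python-dotenv's override flag.

    override=False: an existing key in `environ` shadows the file.
    override=True: the file's value replaces it, which is the behavior
    the upload script wants for a repo-root .env.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            # skip blanks, comments, and lines without KEY=VALUE shape
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key, value = key.strip(), value.strip()
            if override or key not in environ:
                environ[key] = value
```

The real script passes the process environment (`os.environ`) instead of a plain dict; the precedence logic is the same.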
src/model.py
@@ -1,10 +1,72 @@
 """ResNet-50 backbone variants for multi-task classification."""
+from __future__ import annotations
+
+from pathlib import Path
+
 import torch
 import torch.nn as nn
+import torch.nn.functional as F
 from torchvision.models import ResNet50_Weights, resnet50


-class ResNet50ThreeHeads(nn.Module):
+def _topk_from_logits(
+    logits: torch.Tensor, id2label: dict[int, str], k: int
+) -> list[tuple[str, float]]:
+    """Map top-k softmax probabilities to label names (batch size 1)."""
+    n = logits.size(-1)
+    k = min(k, n)
+    probs = F.softmax(logits, dim=-1)[0]
+    top = probs.topk(k)
+    return [(id2label[int(i)], float(p)) for i, p in zip(top.indices.tolist(), top.values.tolist())]
+
+
+class _ThreeHeadPredictMixin:
+    """Top-k decoding for genre / style / artist heads (shared by CNN and CNN–RNN)."""
+
+    def predict_topk(
+        self,
+        x: torch.Tensor,
+        *,
+        genre_id2label: dict[int, str],
+        style_id2label: dict[int, str],
+        artist_id2label: dict[int, str],
+        k: int = 3,
+    ) -> tuple[list[tuple[str, float]], list[tuple[str, float]], list[tuple[str, float]]]:
+        self.eval()
+        with torch.no_grad():
+            lg, ls, la = self(x)
+        return (
+            _topk_from_logits(lg, genre_id2label, k),
+            _topk_from_logits(ls, style_id2label, k),
+            _topk_from_logits(la, artist_id2label, k),
+        )
+
+    def predict_topk_from_path(
+        self,
+        path: Path | str,
+        transform: torch.nn.Module,
+        device: torch.device,
+        *,
+        genre_id2label: dict[int, str],
+        style_id2label: dict[int, str],
+        artist_id2label: dict[int, str],
+        k: int = 3,
+    ) -> tuple[list[tuple[str, float]], list[tuple[str, float]], list[tuple[str, float]]]:
+        from PIL import Image
+
+        p = Path(path)
+        img = Image.open(p).convert("RGB")
+        x = transform(img).unsqueeze(0).to(device)
+        return self.predict_topk(
+            x,
+            genre_id2label=genre_id2label,
+            style_id2label=style_id2label,
+            artist_id2label=artist_id2label,
+            k=k,
+        )
+
+
+class ResNet50ThreeHeads(_ThreeHeadPredictMixin, nn.Module):
     """ResNet-50 (ImageNet pretrained), GAP, then three linear heads: genre, style, artist."""

     def __init__(
@@ -58,7 +120,7 @@ class ResNet50ThreeHeads(nn.Module):
         )


-class ResNet50BiLSTMThreeHeads(nn.Module):
+class ResNet50BiLSTMThreeHeads(_ThreeHeadPredictMixin, nn.Module):
     """
     ResNet-50 (ImageNet pretrained) feature map -> column pooling -> BiLSTM -> mean pool -> three heads.
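`_topk_from_logits` above clamps `k` to the number of classes, softmaxes once, and maps indices through `id2label`. The same decoding contract can be mirrored without torch — `topk_from_logits` below is a pure-Python sketch with made-up labels, not the module's function:

```python
import math


def topk_from_logits(logits: list[float], id2label: dict[int, str], k: int) -> list[tuple[str, float]]:
    """Pure-Python mirror of _topk_from_logits: softmax, then the k best labels."""
    k = min(k, len(logits))                   # clamp k to the number of classes
    m = max(logits)                           # subtract max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(id2label[i], probs[i]) for i in order[:k]]


labels = {0: "portrait", 1: "landscape", 2: "still_life"}  # illustrative labels
top = topk_from_logits([2.0, 1.0, 0.1], labels, k=5)       # k=5 clamps to 3 classes
```

The clamp is what lets callers pass a fixed `k=3` even against a head with fewer classes, which is the edge the tests below rely on.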
src/predict_format.py (new file)
@@ -0,0 +1,9 @@
+"""Format `model.predict_topk` tuples for Gradio summaries and JSON."""
+from __future__ import annotations
+
+from typing import Any
+
+
+def topk_tuples_to_ui_items(rows: list[tuple[str, float]]) -> list[dict[str, Any]]:
+    """Match legacy Gradio `_topk` shape: `label` + `prob` rounded to 4 decimals."""
+    return [{"label": label, "prob": round(float(prob), 4)} for label, prob in rows]
tests/test_model_architectures.py
@@ -2,6 +2,8 @@ import sys
 from pathlib import Path

 import torch
+from PIL import Image
+from torchvision import transforms as T


 def test_resnet50_three_heads_forward_shapes_no_weights() -> None:
@@ -17,6 +19,52 @@ def test_resnet50_three_heads_forward_shapes_no_weights() -> None:
     assert a.shape == (2, 23)


+def test_resnet50_three_heads_predict_topk() -> None:
+    root = Path(__file__).resolve().parent.parent
+    sys.path.insert(0, str(root / "src"))
+    from model import ResNet50ThreeHeads
+
+    model = ResNet50ThreeHeads(n_genre=10, n_style=27, n_artist=23, weights=None)
+    x = torch.randn(1, 3, 224, 224)
+    gmap = {i: f"g{i}" for i in range(10)}
+    smap = {i: f"s{i}" for i in range(27)}
+    amap = {i: f"a{i}" for i in range(23)}
+    g, s, a = model.predict_topk(
+        x,
+        genre_id2label=gmap,
+        style_id2label=smap,
+        artist_id2label=amap,
+        k=3,
+    )
+    assert len(g) == len(s) == len(a) == 3
+    assert all(isinstance(name, str) and 0.0 <= p <= 1.0 for name, p in g + s + a)
+
+
+def test_resnet50_three_heads_predict_topk_from_path(tmp_path: Path) -> None:
+    root = Path(__file__).resolve().parent.parent
+    sys.path.insert(0, str(root / "src"))
+    from model import ResNet50ThreeHeads
+
+    img_path = tmp_path / "x.jpg"
+    Image.new("RGB", (256, 256), color=(120, 80, 40)).save(img_path, format="JPEG")
+
+    model = ResNet50ThreeHeads(n_genre=10, n_style=27, n_artist=23, weights=None)
+    device = torch.device("cpu")
+    transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
+    gmap = {i: f"g{i}" for i in range(10)}
+    smap = {i: f"s{i}" for i in range(27)}
+    amap = {i: f"a{i}" for i in range(23)}
+    g, s, a = model.predict_topk_from_path(
+        img_path,
+        transform,
+        device,
+        genre_id2label=gmap,
+        style_id2label=smap,
+        artist_id2label=amap,
+    )
+    assert len(g) == len(s) == len(a) == 3
+
+
 def test_resnet50_bilstm_three_heads_forward_shapes_no_weights() -> None:
     root = Path(__file__).resolve().parent.parent
     sys.path.insert(0, str(root / "src"))
tests/test_predict_format.py (new file)
@@ -0,0 +1,12 @@
+from pathlib import Path
+import sys
+
+ROOT = Path(__file__).resolve().parent.parent
+
+
+def test_topk_tuples_to_ui_items_rounding() -> None:
+    sys.path.insert(0, str(ROOT / "src"))
+    from predict_format import topk_tuples_to_ui_items
+
+    out = topk_tuples_to_ui_items([("a", 0.123456789), ("b", 0.5)])
+    assert out == [{"label": "a", "prob": 0.1235}, {"label": "b", "prob": 0.5}]
tests/test_spot_check_excluded_post_impressionism.py (new file)
@@ -0,0 +1,47 @@
+"""Tests for scripts/spot_check_excluded_post_impressionism.py helpers."""
+from __future__ import annotations
+
+import importlib.util
+import json
+import sys
+from pathlib import Path
+
+import torch
+
+ROOT = Path(__file__).resolve().parent.parent
+SCRIPT = ROOT / "scripts" / "spot_check_excluded_post_impressionism.py"
+
+
+def _load_module():
+    spec = importlib.util.spec_from_file_location("spot_check_excluded_post_impressionism", SCRIPT)
+    mod = importlib.util.module_from_spec(spec)
+    assert spec.loader is not None
+    sys.modules["spot_check_excluded_post_impressionism"] = mod
+    spec.loader.exec_module(mod)
+    return mod
+
+
+def test_load_id2label_roundtrip(tmp_path: Path) -> None:
+    mod = _load_module()
+    p = tmp_path / "m.json"
+    p.write_text(json.dumps({"0": "foo", "1": "bar"}), encoding="utf-8")
+    assert mod.load_id2label(p) == {0: "foo", 1: "bar"}
+
+
+def test_default_rel_paths_count() -> None:
+    mod = _load_module()
+    assert len(mod.DEFAULT_REL_PATHS) == 5
+
+
+def test_resolve_device_force_cpu() -> None:
+    mod = _load_module()
+    assert mod.resolve_device(force_cpu=True).type == "cpu"
+
+
+def test_load_label_maps_reads_repo_json() -> None:
+    mod = _load_module()
+    if not (mod.LABEL_MAPS_DIR / "genre_id2label.json").exists():
+        return
+    g, s, a = mod.load_label_maps()
+    assert "Post_Impressionism" in s.values()
+    assert len(g) >= 1 and len(a) >= 1
tests/test_train_cnn_atomic_save.py (new file)
@@ -0,0 +1,33 @@
+"""Tests for scripts/train_cnn.py checkpoint atomic save helper."""
+
+from __future__ import annotations
+
+import importlib.util
+import sys
+from pathlib import Path
+
+import torch
+
+ROOT = Path(__file__).resolve().parent.parent
+SCRIPT = ROOT / "scripts" / "train_cnn.py"
+
+
+def _load_train_cnn():
+    spec = importlib.util.spec_from_file_location("train_cnn", SCRIPT)
+    mod = importlib.util.module_from_spec(spec)
+    assert spec.loader is not None
+    sys.modules["train_cnn"] = mod
+    spec.loader.exec_module(mod)
+    return mod
+
+
+def test_atomic_torch_save_roundtrip(tmp_path: Path) -> None:
+    mod = _load_train_cnn()
+    path = tmp_path / "ckpt.pt"
+    mod._atomic_torch_save({"k": 42, "t": torch.zeros(2)}, path)
+    assert path.is_file()
+    assert not path.with_suffix(".pt.tmp").exists()
+
+    data = torch.load(path, map_location="cpu", weights_only=False)
+    assert data["k"] == 42
+    assert list(data["t"].shape) == [2]
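The roundtrip test above exercises `train_cnn._atomic_torch_save`, which per the changelog writes through a temp file and then replaces. A minimal sketch of that temp+`os.replace` pattern — `atomic_save` is an illustrative name, and pickle stands in for `torch.save` so the example carries no torch dependency:

```python
import os
import pickle
import tempfile
from pathlib import Path


def atomic_save(obj, path: Path) -> None:
    """Write obj to <path>.tmp, then rename onto path.

    A crash mid-write leaves at worst a stray .tmp file, never a
    truncated checkpoint at the final path.
    """
    tmp = path.with_suffix(path.suffix + ".tmp")  # ckpt.pt -> ckpt.pt.tmp
    tmp.write_bytes(pickle.dumps(obj))            # torch.save(obj, tmp) in the real helper
    os.replace(tmp, path)                         # swaps the file in atomically where the OS supports it


# Demo (temp directory is left for the OS to clean up):
ckpt = Path(tempfile.mkdtemp()) / "ckpt.pt"
atomic_save({"epoch": 3, "best_acc": 0.81}, ckpt)
```

This is why the test can assert both that `best.pt` exists and that no `.pt.tmp` sibling is left behind: `os.replace` consumes the temp file on success.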
tests/test_upload_model_to_hf.py
@@ -1,6 +1,7 @@
 """Tests for scripts/upload_model_to_hf.py"""

 import importlib.util
+import os
 import sys
 from pathlib import Path
 from unittest.mock import MagicMock, patch
@@ -16,6 +17,24 @@ sys.modules["upload_model_to_hf"] = mod
 spec.loader.exec_module(mod)


+def test_load_dotenv_from_repo_sets_hf_token(tmp_path: Path, monkeypatch) -> None:
+    monkeypatch.delenv("HF_TOKEN", raising=False)
+    (tmp_path / ".env").write_text("HF_TOKEN=fake_from_dotenv\n", encoding="utf-8")
+    with patch.object(mod, "ROOT", tmp_path):
+        mod._load_dotenv_from_repo()
+
+    assert os.environ.get("HF_TOKEN") == "fake_from_dotenv"
+
+
+def test_load_dotenv_from_repo_overrides_stale_hf_token(tmp_path: Path, monkeypatch) -> None:
+    monkeypatch.setenv("HF_TOKEN", "stale_wrong_token")
+    (tmp_path / ".env").write_text("HF_TOKEN=good_from_dotenv\n", encoding="utf-8")
+    with patch.object(mod, "ROOT", tmp_path):
+        mod._load_dotenv_from_repo()
+
+    assert os.environ.get("HF_TOKEN") == "good_from_dotenv"
+
+
 def test_build_id2label_from_selected_index(tmp_path: Path) -> None:
     index = tmp_path / "index.csv"
     pd.DataFrame(