aminasifar1 commited on
Commit
04f866d
·
verified ·
1 Parent(s): 599969f

Upload folder using huggingface_hub

Browse files
This view is limited to 50 files because it contains too many changes.   See raw diff
Files changed (50) hide show
  1. README.md +248 -0
  2. __pycache__/inference.cpython-313.pyc +0 -0
  3. configs/spai.yaml +54 -0
  4. inference.py +208 -0
  5. requirements.txt +37 -0
  6. spai/__init__.py +15 -0
  7. spai/__main__.py +208 -0
  8. spai/__pycache__/__init__.cpython-310.pyc +0 -0
  9. spai/__pycache__/__init__.cpython-313.pyc +0 -0
  10. spai/__pycache__/__main__.cpython-310.pyc +0 -0
  11. spai/__pycache__/__main__.cpython-313.pyc +0 -0
  12. spai/__pycache__/config.cpython-310.pyc +0 -0
  13. spai/__pycache__/config.cpython-313.pyc +0 -0
  14. spai/__pycache__/data_utils.cpython-310.pyc +0 -0
  15. spai/__pycache__/data_utils.cpython-313.pyc +0 -0
  16. spai/__pycache__/logger.cpython-310.pyc +0 -0
  17. spai/__pycache__/logger.cpython-313.pyc +0 -0
  18. spai/__pycache__/lr_scheduler.cpython-310.pyc +0 -0
  19. spai/__pycache__/lr_scheduler.cpython-313.pyc +0 -0
  20. spai/__pycache__/metrics.cpython-310.pyc +0 -0
  21. spai/__pycache__/metrics.cpython-313.pyc +0 -0
  22. spai/__pycache__/onnx.cpython-310.pyc +0 -0
  23. spai/__pycache__/onnx.cpython-313.pyc +0 -0
  24. spai/__pycache__/optimizer.cpython-310.pyc +0 -0
  25. spai/__pycache__/optimizer.cpython-313.pyc +0 -0
  26. spai/__pycache__/utils.cpython-310.pyc +0 -0
  27. spai/__pycache__/utils.cpython-313.pyc +0 -0
  28. spai/config.py +494 -0
  29. spai/data/__init__.py +26 -0
  30. spai/data/__pycache__/__init__.cpython-310.pyc +0 -0
  31. spai/data/__pycache__/__init__.cpython-313.pyc +0 -0
  32. spai/data/__pycache__/blur_kernels.cpython-310.pyc +0 -0
  33. spai/data/__pycache__/blur_kernels.cpython-313.pyc +0 -0
  34. spai/data/__pycache__/data_finetune.cpython-310.pyc +0 -0
  35. spai/data/__pycache__/data_finetune.cpython-313.pyc +0 -0
  36. spai/data/__pycache__/data_mfm.cpython-310.pyc +0 -0
  37. spai/data/__pycache__/data_mfm.cpython-313.pyc +0 -0
  38. spai/data/__pycache__/filestorage.cpython-310.pyc +0 -0
  39. spai/data/__pycache__/filestorage.cpython-313.pyc +0 -0
  40. spai/data/__pycache__/random_degradations.cpython-310.pyc +0 -0
  41. spai/data/__pycache__/random_degradations.cpython-313.pyc +0 -0
  42. spai/data/__pycache__/readers.cpython-310.pyc +0 -0
  43. spai/data/__pycache__/readers.cpython-313.pyc +0 -0
  44. spai/data/blur_kernels.py +539 -0
  45. spai/data/data_finetune.py +723 -0
  46. spai/data/data_mfm.py +131 -0
  47. spai/data/filestorage.py +387 -0
  48. spai/data/random_degradations.py +462 -0
  49. spai/data/readers.py +178 -0
  50. spai/data_utils.py +50 -0
README.md ADDED
@@ -0,0 +1,248 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # SPAI: Spectral AI-Generated Image Detector
2
+
3
+ Official repository for the CVPR 2025 paper:
4
+ Any-Resolution AI-Generated Image Detection by Spectral Learning.
5
+
6
+ SPAI learns the spectral distribution of real images and detects AI-generated
7
+ images as out-of-distribution samples using spectral reconstruction similarity.
8
+
9
+ ## Repository Status
10
+
11
+ This repository currently contains:
12
+
13
+ - Core SPAI package in `spai/`.
14
+ - Main config in `configs/spai.yaml`.
15
+ - A trained checkpoint in `spai/weights/spai.pth`.
16
+ - Unit tests in `tests/`.
17
+ - Utility scripts for data prep, crawling, Fourier analysis and reporting in `tools/` and `spai/tools/`.
18
+ - Hugging Face inference handler in `inference.py`.
19
+
20
+ ## Project Structure
21
+
22
+ ```text
23
+ .
24
+ ├── configs/
25
+ │ └── spai.yaml
26
+ ├── spai/
27
+ │ ├── data/ # datasets, readers, augmentations, filestorage (LMDB)
28
+ │ ├── models/ # backbones, SID, MFM, losses, filters
29
+ │ ├── tools/ # CSV generation and dataset utilities
30
+ │ ├── weights/
31
+ │ │ └── spai.pth # included checkpoint
32
+ │ ├── config.py # yacs configuration
33
+ │ ├── hf_utils.py # Hugging Face Hub upload/model card helpers
34
+ │ ├── main_mfm.py # MFM pretraining entrypoint
35
+ │ └── ...
36
+ ├── tests/
37
+ │ ├── data/
38
+ │ └── models/
39
+ ├── tools/ # analysis, crawling, preprocessing, HF execution logs
40
+ ├── inference.py # HF EndpointHandler + local single-image inference
41
+ └── requirements.txt
42
+ ```
43
+
44
+ ## Installation
45
+
46
+ Recommended environment:
47
+
48
+ ```bash
49
+ conda create -n spai python=3.11
50
+ conda activate spai
51
+ conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia
52
+ pip install -r requirements.txt
53
+ ```
54
+
55
+ Notes:
56
+
57
+ - Training code may require NVIDIA APEX.
58
+ - `requirements.txt` includes packages for training, inference, ONNX, crawling and Hugging Face utilities.
59
+
60
+ ## Configuration and Weights
61
+
62
+ - Main config: `configs/spai.yaml`
63
+ - Default included checkpoint: `spai/weights/spai.pth`
64
+
65
+ The included config is set for SID finetuning/inference with arbitrary-resolution processing.
66
+
67
+ ## Inference
68
+
69
+ ### 1) Hugging Face Endpoint Handler (recommended)
70
+
71
+ File: `inference.py`
72
+
73
+ The `EndpointHandler` supports these input formats:
74
+
75
+ - image URL (`http://...` / `https://...`)
76
+ - local path
77
+ - base64 string
78
+ - raw bytes
79
+ - PIL image
80
+ - dict with one of keys: `url`, `path`, `b64`, `bytes`
81
+
82
+ Output format:
83
+
84
+ ```json
85
+ {
86
+ "score": 0.8732,
87
+ "predicted_label": 1,
88
+ "predicted_label_name": "ai-generated",
89
+ "threshold": 0.5
90
+ }
91
+ ```
92
+
93
+ Label convention used in repository tooling:
94
+
95
+ - `0` -> real
96
+ - `1` -> ai-generated
97
+
98
+ Run locally:
99
+
100
+ ```bash
101
+ python inference.py --image "/path/to/image.jpg" --model-dir .
102
+ ```
103
+
104
+ Environment overrides:
105
+
106
+ - `SPAI_THRESHOLD` (default `0.5`)
107
+ - `SPAI_CONFIG` (custom config path)
108
+ - `SPAI_CHECKPOINT` (custom checkpoint path)
109
+ - `SPAI_FORCE_CPU=1` (force CPU)
110
+
111
+ ### 2) Python usage
112
+
113
+ ```python
114
+ from inference import EndpointHandler
115
+
116
+ handler = EndpointHandler(path=".")
117
+ result = handler({"inputs": "https://example.com/image.jpg"})
118
+ print(result)
119
+ ```
120
+
121
+ ## Training
122
+
123
+ ### MFM pretraining entrypoint
124
+
125
+ Use:
126
+
127
+ ```bash
128
+ python spai/main_mfm.py --cfg configs/spai.yaml --data-path /path/to/data.csv --output output/mfm
129
+ ```
130
+
131
+ `spai/main_mfm.py` also supports optional Hugging Face push flags:
132
+
133
+ - `--push-to-hub`
134
+ - `--hub-repo-id`
135
+ - `--hub-token`
136
+ - `--hub-create-model-card`
137
+
138
+ ## Dataset CSV Format
139
+
140
+ Core dataset readers in `spai/data/data_finetune.py` expect CSVs with at least:
141
+
142
+ - `image`: image path
143
+ - `class`: class id
144
+ - `split`: one of `train`, `val`, `test`
145
+
146
+ Paths are resolved relative to a configurable CSV root directory.
147
+
148
+ ## LMDB Dataset File Storage
149
+
150
+ Module: `spai/data/filestorage.py`
151
+
152
+ Available commands:
153
+
154
+ ```bash
155
+ python spai/data/filestorage.py add-csv --help
156
+ python spai/data/filestorage.py add-db --help
157
+ python spai/data/filestorage.py verify-csv --help
158
+ python spai/data/filestorage.py list-db --help
159
+ ```
160
+
161
+ Use this workflow when you want to package many files into LMDB for faster or centralized IO.
162
+
163
+ ## Utility Scripts
164
+
165
+ ### Repository-level tools (`tools/`)
166
+
167
+ - `tools/simple_crawler.py`: crawl and download images with metadata.
168
+ - `tools/web_image_crawler.py`: crawl URLs/CSVs, download images, filter ad-like images.
169
+ - `tools/image_quality_processor.py`: quality filtering, deduplication and reports.
170
+ - `tools/preprocess_for_spai.py`: image preprocessing before SPAI.
171
+ - `tools/create_spai_metadata.py`: build metadata CSV from an image folder.
172
+ - `tools/extract_fourier_features.py`: compute Fourier-derived features.
173
+ - `tools/visualize_fourier.py`: Fourier spectrum visualizations.
174
+ - `tools/visualize_noise_decomposition.py`: advanced noise decomposition visualizations.
175
+ - `tools/analyze_spai_results.py`: plots/analysis for prediction results.
176
+ - `tools/analyze_normalization_impact.py`: study resize normalization impact.
177
+ - `tools/hf_log_execution.py`: generate execution artifacts and optionally upload to HF datasets.
178
+
179
+ Example:
180
+
181
+ ```bash
182
+ python tools/hf_log_execution.py --results-csv output/preds.csv --output-dir output/hf_artifacts
183
+ ```
184
+
185
+ ### Package tools (`spai/tools/`)
186
+
187
+ - `spai.tools.create_dir_csv`: create train/val/test CSV from directories.
188
+ - `spai.tools.create_dmid_ldm_train_val_csv`: create DMID/LDM training CSV.
189
+ - `spai.tools.augment_dataset`: augment a dataset and export updated CSV.
190
+ - `spai.tools.reduce_csv_column`: conditional column reduction/aggregation.
191
+ - `spai/tools/create_synthbuster_csv.py`: Synthbuster CSV generation utility.
192
+
193
+ Examples:
194
+
195
+ ```bash
196
+ python -m spai.tools.create_dir_csv --help
197
+ python -m spai.tools.create_dmid_ldm_train_val_csv --help
198
+ python -m spai.tools.augment_dataset --help
199
+ python -m spai.tools.reduce_csv_column --help
200
+ ```
201
+
202
+ For `create_synthbuster_csv.py`, use a `PYTHONPATH` that includes `spai/` due to its import style:
203
+
204
+ ```bash
205
+ PYTHONPATH=spai python spai/tools/create_synthbuster_csv.py --help
206
+ ```
207
+
208
+ ## Tests
209
+
210
+ Run all tests:
211
+
212
+ ```bash
213
+ pytest tests -q
214
+ ```
215
+
216
+ Current test folders:
217
+
218
+ - `tests/data/`
219
+ - `tests/models/`
220
+
221
+ ## Acknowledgments
222
+
223
+ This work was partly supported by Horizon Europe projects ELIAS and vera.ai,
224
+ and computational resources from GRNET.
225
+
226
+ Parts of the implementation build upon ideas/code from:
227
+ https://github.com/Jiahao000/MFM
228
+
229
+ ## License
230
+
231
+ Source code is licensed under Apache 2.0.
232
+ Third-party datasets and dependencies keep their own licenses.
233
+
234
+ ## Contact
235
+
236
+ For questions: d.karageorgiou@uva.nl
237
+
238
+ ## Citation
239
+
240
+ ```text
241
+ @inproceedings{karageorgiou2025any,
242
+ title={Any-resolution ai-generated image detection by spectral learning},
243
+ author={Karageorgiou, Dimitrios and Papadopoulos, Symeon and Kompatsiaris, Ioannis and Gavves, Efstratios},
244
+ booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
245
+ pages={18706--18717},
246
+ year={2025}
247
+ }
248
+ ```
__pycache__/inference.cpython-313.pyc ADDED
Binary file (12.7 kB). View file
 
configs/spai.yaml ADDED
@@ -0,0 +1,54 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ MODEL:
2
+ SID_APPROACH: "freq_restoration"
3
+ TYPE: vit
4
+ NAME: finetune
5
+ DROP_PATH_RATE: 0.1
6
+ NUM_CLASSES: 2
7
+ REQUIRED_NORMALIZATION: "positive_0_1"
8
+ RESOLUTION_MODE: "arbitrary"
9
+ FEATURE_EXTRACTION_BATCH: 400
10
+ VIT:
11
+ EMBED_DIM: 768
12
+ DEPTH: 12
13
+ NUM_HEADS: 12
14
+ INIT_VALUES: None
15
+ USE_APE: True
16
+ USE_RPB: False
17
+ USE_SHARED_RPB: False
18
+ USE_MEAN_POOLING: True
19
+ USE_INTERMEDIATE_LAYERS: True
20
+ PROJECTION_DIM: 1024
21
+ PROJECTION_LAYERS: 2
22
+ PATCH_PROJECTION: True
23
+ PATCH_PROJECTION_PER_FEATURE: True
24
+ INTERMEDIATE_LAYERS: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
25
+ FRE:
26
+ MASKING_RADIUS: 16
27
+ PROJECTOR_LAST_LAYER_ACTIVATION_TYPE: None
28
+ ORIGINAL_IMAGE_FEATURES_BRANCH: True
29
+ CLS_HEAD:
30
+ MLP_RATIO: 3
31
+ PATCH_VIT:
32
+ MINIMUM_PATCHES: 4
33
+ DATA:
34
+ DATASET: csv_sid
35
+ IMG_SIZE: 224
36
+ NUM_WORKERS: 8
37
+ AUGMENTED_VIEWS: 4
38
+ TEST_PREFETCH_FACTOR: 1
39
+ AUG:
40
+ COLOR_JITTER: 0.
41
+ TRAIN:
42
+ EPOCHS: 35
43
+ WARMUP_EPOCHS: 5
44
+ BASE_LR: 5e-4
45
+ WARMUP_LR: 2.5e-7
46
+ MIN_LR: 2.5e-7
47
+ WEIGHT_DECAY: 0.05
48
+ LAYER_DECAY: 0.8
49
+ CLIP_GRAD: None
50
+ LOSS: "bce"
51
+ TEST:
52
+ ORIGINAL_RESOLUTION: True
53
+ PRINT_FREQ: 100
54
+ SAVE_FREQ: 10
inference.py ADDED
@@ -0,0 +1,208 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from __future__ import annotations
2
+
3
+ import argparse
4
+ import base64
5
+ import io
6
+ import os
7
+ from pathlib import Path
8
+ from typing import Any
9
+
10
+ import numpy as np
11
+ import requests
12
+ import torch
13
+ from PIL import Image
14
+
15
+ from spai.config import get_custom_config
16
+ from spai.data.data_finetune import build_transform
17
+ from spai.models import build_cls_model
18
+
19
+
20
+ class EndpointHandler:
21
+ """Hugging Face Inference Endpoint handler for SPAI."""
22
+
23
+ def __init__(self, path: str = "") -> None:
24
+ self.model_dir = Path(path) if path else Path(".")
25
+ self.threshold = float(os.getenv("SPAI_THRESHOLD", "0.5"))
26
+
27
+ cfg_path = self._resolve_config_path()
28
+ self.config = get_custom_config(str(cfg_path))
29
+
30
+ self.device = self._resolve_device()
31
+ self.model = build_cls_model(self.config)
32
+ checkpoint_path = self._resolve_checkpoint_path()
33
+ state_dict = self._load_state_dict(checkpoint_path)
34
+ self.model.load_state_dict(state_dict, strict=False)
35
+ self.model.to(self.device)
36
+ self.model.eval()
37
+
38
+ self.transform = build_transform(is_train=False, config=self.config)
39
+
40
+ def __call__(self, data: dict[str, Any]) -> dict[str, Any] | list[dict[str, Any]]:
41
+ inputs = data.get("inputs", data.get("image", data))
42
+
43
+ if isinstance(inputs, list):
44
+ return [self._predict_one(item) for item in inputs]
45
+ return self._predict_one(inputs)
46
+
47
+ def _predict_one(self, raw_input: Any) -> dict[str, Any]:
48
+ image = self._load_image(raw_input)
49
+ image_np = np.array(image)
50
+ image_tensor = self.transform(image=image_np)["image"]
51
+
52
+ if self.config.MODEL.RESOLUTION_MODE == "arbitrary":
53
+ model_input = [image_tensor.unsqueeze(0).to(self.device)]
54
+ feature_batch_size = self.config.MODEL.FEATURE_EXTRACTION_BATCH
55
+ with torch.no_grad():
56
+ logits = self.model(model_input, feature_batch_size)
57
+ else:
58
+ model_input = image_tensor.unsqueeze(0).to(self.device)
59
+ with torch.no_grad():
60
+ logits = self.model(model_input)
61
+
62
+ score = float(torch.sigmoid(logits).flatten()[0].item())
63
+ predicted_label = int(score >= self.threshold)
64
+
65
+ return {
66
+ "score": score,
67
+ "predicted_label": predicted_label,
68
+ "predicted_label_name": "ai-generated" if predicted_label == 1 else "real",
69
+ "threshold": self.threshold,
70
+ }
71
+
72
+ def _resolve_config_path(self) -> Path:
73
+ env_cfg = os.getenv("SPAI_CONFIG")
74
+ if env_cfg:
75
+ cfg_path = Path(env_cfg)
76
+ if cfg_path.exists():
77
+ return cfg_path
78
+ raise FileNotFoundError(f"SPAI_CONFIG points to a missing file: {cfg_path}")
79
+
80
+ candidates = [
81
+ self.model_dir / "configs" / "spai.yaml",
82
+ self.model_dir / "spai.yaml",
83
+ self.model_dir / "config.yaml",
84
+ ]
85
+ for candidate in candidates:
86
+ if candidate.exists():
87
+ return candidate
88
+
89
+ raise FileNotFoundError(
90
+ "Could not locate model config. Expected one of: "
91
+ "configs/spai.yaml, spai.yaml, config.yaml, or SPAI_CONFIG env var."
92
+ )
93
+
94
+ def _resolve_checkpoint_path(self) -> Path:
95
+ env_ckpt = os.getenv("SPAI_CHECKPOINT")
96
+ if env_ckpt:
97
+ ckpt_path = Path(env_ckpt)
98
+ if ckpt_path.exists():
99
+ return ckpt_path
100
+ raise FileNotFoundError(f"SPAI_CHECKPOINT points to a missing file: {ckpt_path}")
101
+
102
+ candidates = [
103
+ self.model_dir / "spai.pth",
104
+ self.model_dir / "pytorch_model.bin",
105
+ self.model_dir / "weights" / "spai.pth",
106
+ self.model_dir / "spai" / "weights" / "spai.pth",
107
+ ]
108
+ for candidate in candidates:
109
+ if candidate.exists():
110
+ return candidate
111
+
112
+ pth_files = sorted(self.model_dir.glob("*.pth"))
113
+ if pth_files:
114
+ return pth_files[0]
115
+
116
+ raise FileNotFoundError(
117
+ "Could not locate model checkpoint. Expected one of: "
118
+ "spai.pth, pytorch_model.bin, weights/spai.pth, spai/weights/spai.pth, "
119
+ "or SPAI_CHECKPOINT env var."
120
+ )
121
+
122
+ @staticmethod
123
+ def _resolve_device() -> torch.device:
124
+ force_cpu = os.getenv("SPAI_FORCE_CPU", "0") == "1"
125
+ if (not force_cpu) and torch.cuda.is_available():
126
+ return torch.device("cuda")
127
+ return torch.device("cpu")
128
+
129
+ @staticmethod
130
+ def _load_state_dict(checkpoint_path: Path) -> dict[str, torch.Tensor]:
131
+ checkpoint = torch.load(checkpoint_path, map_location="cpu", weights_only=False)
132
+ if isinstance(checkpoint, dict) and "model" in checkpoint and isinstance(checkpoint["model"], dict):
133
+ return checkpoint["model"]
134
+
135
+ if isinstance(checkpoint, dict):
136
+ tensor_values = all(isinstance(v, torch.Tensor) for v in checkpoint.values())
137
+ if tensor_values:
138
+ return checkpoint
139
+
140
+ raise RuntimeError(
141
+ "Unsupported checkpoint format. Expected a dict with key 'model' or a raw state_dict."
142
+ )
143
+
144
+ def _load_image(self, raw_input: Any) -> Image.Image:
145
+ if isinstance(raw_input, Image.Image):
146
+ return raw_input.convert("RGB")
147
+
148
+ if isinstance(raw_input, bytes):
149
+ return Image.open(io.BytesIO(raw_input)).convert("RGB")
150
+
151
+ if isinstance(raw_input, dict):
152
+ if "bytes" in raw_input:
153
+ raw_bytes = raw_input["bytes"]
154
+ if isinstance(raw_bytes, str):
155
+ raw_bytes = base64.b64decode(raw_bytes)
156
+ return Image.open(io.BytesIO(raw_bytes)).convert("RGB")
157
+ if "b64" in raw_input:
158
+ return Image.open(io.BytesIO(base64.b64decode(raw_input["b64"]))).convert("RGB")
159
+ if "url" in raw_input:
160
+ return self._load_image_from_url(raw_input["url"])
161
+ if "path" in raw_input:
162
+ return Image.open(Path(raw_input["path"])).convert("RGB")
163
+
164
+ if isinstance(raw_input, str):
165
+ if raw_input.startswith("http://") or raw_input.startswith("https://"):
166
+ return self._load_image_from_url(raw_input)
167
+
168
+ if raw_input.startswith("data:image") and "," in raw_input:
169
+ _, encoded = raw_input.split(",", 1)
170
+ return Image.open(io.BytesIO(base64.b64decode(encoded))).convert("RGB")
171
+
172
+ maybe_path = Path(raw_input)
173
+ if maybe_path.exists():
174
+ return Image.open(maybe_path).convert("RGB")
175
+
176
+ try:
177
+ decoded = base64.b64decode(raw_input, validate=True)
178
+ return Image.open(io.BytesIO(decoded)).convert("RGB")
179
+ except Exception as exc:
180
+ raise ValueError(
181
+ "String input is neither a valid URL, file path, nor base64 image payload."
182
+ ) from exc
183
+
184
+ raise TypeError(
185
+ "Unsupported input type. Use a URL/path/base64 string, bytes, PIL.Image, "
186
+ "or dict with one of keys: bytes, b64, url, path."
187
+ )
188
+
189
+ @staticmethod
190
+ def _load_image_from_url(url: str) -> Image.Image:
191
+ response = requests.get(url, timeout=15)
192
+ response.raise_for_status()
193
+ return Image.open(io.BytesIO(response.content)).convert("RGB")
194
+
195
+
196
def _main() -> None:
    """CLI entry point: score one image with SPAI and print the result dict."""
    cli = argparse.ArgumentParser(description="Run SPAI inference for a single image.")
    cli.add_argument(
        "--image", type=str, required=True, help="Image path/URL/base64 input"
    )
    cli.add_argument(
        "--model-dir", type=str, default=".", help="Directory with config/checkpoint"
    )
    options = cli.parse_args()

    # Build the handler once, then feed it the standard endpoint payload shape.
    endpoint = EndpointHandler(path=options.model_dir)
    prediction = endpoint({"inputs": options.image})
    print(prediction)
requirements.txt ADDED
@@ -0,0 +1,37 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Install through conda
2
+ # pytorch
3
+ # torchvision~=0.18.1
4
+ # tsnecuda
5
+ # Compile from sources
6
+ # apex
7
+ # Install through pip
8
+ opencv-python~=4.10.0.84
9
+ pyyaml~=6.0.1
10
+ scipy~=1.14.0
11
+ tensorboard
12
+ termcolor~=2.4.0
13
+ timm==0.4.12
14
+ yacs~=0.1.8
15
+ numpy~=1.26.4
16
+ torchmetrics~=1.4.0.post0
17
+ tqdm~=4.66.4
18
+ pillow~=10.4.0
19
+ PyYAML
20
+ click~=8.1.7
21
+ neptune~=1.11.1
22
+ albumentations==1.4.14
23
+ albucore==0.0.16
24
+ lmdb~=1.5.1
25
+ networkx~=3.3
26
+ seaborn~=0.13.2
27
+ pandas~=2.2.2
28
+ neptune
29
+ einops~=0.8.0
30
+ git+https://github.com/openai/CLIP.git
31
+ onnx
32
+ onnxscript
33
+ huggingface_hub~=0.21.0
34
+ datasets~=2.19.0
35
+ requests~=2.32.3
36
+ beautifulsoup4~=4.12.3
37
+ imagehash~=4.3.1
spai/__init__.py ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # SPDX-FileCopyrightText: Copyright (c) 2025 Centre for Research and Technology Hellas
2
+ # and University of Amsterdam. All rights reserved.
3
+ # SPDX-License-Identifier: Apache-2.0
4
+ #
5
+ # Licensed under the Apache License, Version 2.0 (the "License");
6
+ # you may not use this file except in compliance with the License.
7
+ # You may obtain a copy of the License at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # Unless required by applicable law or agreed to in writing, software
12
+ # distributed under the License is distributed on an "AS IS" BASIS,
13
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14
+ # See the License for the specific language governing permissions and
15
+ # limitations under the License.
spai/__main__.py ADDED
@@ -0,0 +1,208 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # SPDX-FileCopyrightText: Copyright (c) 2025 Centre for Research and Technology Hellas
2
+ # and University of Amsterdam. All rights reserved.
3
+ # SPDX-License-Identifier: Apache-2.0
4
+ #
5
+ # Licensed under the Apache License, Version 2.0 (the "License");
6
+ # you may not use this file except in compliance with the License.
7
+ # You may obtain a copy of the License at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # Unless required by applicable law or agreed to in writing, software
12
+ # distributed under the License is distributed on an "AS IS" BASIS,
13
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14
+ # See the License for the specific language governing permissions and
15
+ # limitations under the License.
16
+
17
+ import logging
18
+ import os
19
+ import pathlib
20
+ import time
21
+ import datetime
22
+ from pathlib import Path
23
+ from typing import Optional
24
+
25
+ import numpy as np
26
+
27
+ import neptune
28
+ import cv2
29
+ import click
30
+ import torch
31
+ import torch.backends.cudnn as cudnn
32
+ import torch.utils.data
33
+ import yacs
34
+ import filetype
35
+ from torch import nn
36
+ from torch.nn import TripletMarginLoss
37
+ from torch.utils.tensorboard import SummaryWriter
38
+ from timm.utils import AverageMeter
39
+ from yacs.config import CfgNode
40
+
41
+ import spai.data.data_finetune
42
+ from spai.config import get_config
43
+ from spai.models import build_cls_model
44
+ from spai.data import build_loader, build_loader_test
45
+ from spai.lr_scheduler import build_scheduler
46
+ from spai.models.sid import AttentionMask
47
+ from spai.onnx import compare_pytorch_onnx_models
48
+ from spai.optimizer import build_optimizer
49
+ from spai.logger import create_logger
50
+ from spai.utils import (
51
+ load_pretrained,
52
+ save_checkpoint,
53
+ get_grad_norm,
54
+ find_pretrained_checkpoints,
55
+ inf_nan_to_num
56
+ )
57
+ from spai.models import losses
58
+ from spai import metrics
59
+ from spai import data_utils
60
+
61
+
62
+ def _cuda_enabled() -> bool:
63
+ # Allow forcing CPU mode and avoid probing CUDA on incompatible drivers.
64
+ if os.environ.get("SPAI_FORCE_CPU", "0") == "1":
65
+ return False
66
+ if os.environ.get("CUDA_VISIBLE_DEVICES", "") == "":
67
+ return False
68
+ try:
69
+ return torch.cuda.is_available()
70
+ except Exception:
71
+ return False
72
+
73
+ try:
74
+ # noinspection PyUnresolvedReferences
75
+ from apex import amp
76
+ except ImportError:
77
+ amp = None
78
+
79
+ cv2.setNumThreads(1)
80
+ logger: Optional[logging.Logger] = None
81
+
82
+
83
+ @click.group()
84
+ def cli() -> None:
85
+ pass
86
+
87
+
88
+ @cli.command()
89
+ @click.option("--cfg", required=True,
90
+ type=click.Path(exists=True, dir_okay=False, path_type=Path))
91
+ @click.option("--batch-size", type=int,
92
+ help="Batch size for a single GPU.")
93
+ @click.option("--learning-rate", type=float)
94
+ @click.option("--data-path", required=True,
95
+ type=click.Path(exists=True, dir_okay=False, path_type=Path),
96
+ help="path to dataset")
97
+ @click.option("--csv-root-dir",
98
+ type=click.Path(exists=True, file_okay=False, path_type=Path))
99
+ @click.option("--lmdb", "lmdb_path",
100
+ type=click.Path(exists=True, dir_okay=False, path_type=Path),
101
+ help="Path to an LMDB file storage that contains the files defined in the "
102
+ "dataset's CSV file. If this option is not provided, the data will be "
103
+ "loaded from the filesystem.")
104
+ @click.option("--pretrained",
105
+ type=click.Path(exists=True, dir_okay=False),
106
+ help="path to pre-trained model")
107
+ @click.option("--resume", is_flag=True,
108
+ help="resume from checkpoint")
109
+ @click.option("--accumulation-steps", type=int, default=1,
110
+ help="Gradient accumulation steps.")
111
+ @click.option("--use-checkpoint", is_flag=True,
112
+ help="Whether to use gradient checkpointing to save memory.")
113
+ @click.option("--amp-opt-level", type=click.Choice(["O0", "O1", "O2"]), default="O1",
114
+ help="mixed precision opt level, if O0, no amp is used")
115
+ @click.option("--output", type=click.Path(file_okay=False, path_type=Path),
116
+ help="root of output folder, the full path is "
117
+ "<output>/<model_name>/<tag> (default: output)")
118
+ @click.option("--tag", type=str,
119
+ help="tag of experiment")
120
+ @click.option("--local_rank", type=int, default=0,
121
+ help="local_rank for distributed training")
122
+ @click.option("--test-csv", multiple=True,
123
+ type=click.Path(exists=True, dir_okay=False, path_type=Path),
124
+ help="Path to a CSV with test data. If this option is provided after the "
125
+ "validation of each epoch, a testing will also take place. This option "
126
+ "intends to facilitate understanding the progression of the generalization "
127
+ "ability of a model among the epochs and should not be used for selecting "
128
+ "the final model. This option can be repeated several times. For each provided "
129
+ "csv file, a separate testing run is going to take place.")
130
+ @click.option("--test-csv-root-dir", multiple=True,
131
+ type=click.Path(exists=True, file_okay=False, path_type=Path),
132
+ help="Root directory for the relative paths included into the test csv files. "
133
+ "If this option is omitted, the parent directory of each test csv file will "
134
+ "be used as the root dir for the paths it contains. If this option is provided "
135
+ "a single time, it will be used as the root dir for all the test csv files. If "
136
+ "it is provided multiple times, each value will be matched with a corresponding "
137
+ "test csv file. In that case, the number of provided test csv files and the "
138
+ "number of provided root directories should match. The order of the provided "
139
+ "arguments will be used for the matching.")
140
+ @click.option("--data-workers", type=int,
141
+ help="Number of worker processes to be used for data loading.")
142
+ @click.option("--disable-pin-memory", is_flag=True)
143
+ @click.option("--data-prefetch-factor", type=int)
144
+ @click.option("--save-all", is_flag=True)
145
+ @click.option("--opt", "extra_options", type=(str, str), multiple=True)
146
+ def train(
147
+ cfg: Path,
148
+ batch_size: Optional[int],
149
+ learning_rate: Optional[float],
150
+ data_path: Path,
151
+ csv_root_dir: Optional[Path],
152
+ lmdb_path: Optional[Path],
153
+ pretrained: Optional[Path],
154
+ resume: bool,
155
+ accumulation_steps: int,
156
+ use_checkpoint: bool,
157
+ amp_opt_level: str,
158
+ output: Path,
159
+ tag: str,
160
+ local_rank: int,
161
+ test_csv: list[Path],
162
+ test_csv_root_dir: list[Path],
163
+ data_workers: Optional[int],
164
+ disable_pin_memory: bool,
165
+ data_prefetch_factor: Optional[int],
166
+ save_all: bool,
167
+ extra_options: tuple[str, str]
168
+ ) -> None:
169
+ if csv_root_dir is None:
170
+ csv_root_dir = data_path.parent
171
+ config = get_config({
172
+ "cfg": str(cfg),
173
+ "batch_size": batch_size,
174
+ "learning_rate": learning_rate,
175
+ "data_path": str(data_path),
176
+ "csv_root_dir": str(csv_root_dir),
177
+ "lmdb_path": str(lmdb_path),
178
+ "pretrained": str(pretrained) if pretrained is not None else None,
179
+ "resume": resume,
180
+ "accumulation_steps": accumulation_steps,
181
+ "use_checkpoint": use_checkpoint,
182
+ "amp_opt_level": amp_opt_level,
183
+ "output": str(output),
184
+ "tag": tag,
185
+ "local_rank": local_rank,
186
+ "test_csv": [str(p) for p in test_csv],
187
+ "test_csv_root": [str(p) for p in test_csv_root_dir],
188
+ "data_workers": data_workers,
189
+ "disable_pin_memory": disable_pin_memory,
190
+ "data_prefetch_factor": data_prefetch_factor,
191
+ "opts": extra_options
192
+ })
193
+ if 'LOCAL_RANK' not in os.environ:
194
+ os.environ['LOCAL_RANK'] = str(local_rank)
195
+
196
+ if config.AMP_OPT_LEVEL != "O0":
197
+ assert amp is not None, "amp not installed!"
198
+
199
+ # Set a fixed seed to all the random number generators.
200
+ seed = config.SEED
201
+ torch.manual_seed(seed)
202
+ np.random.seed(seed)
203
+ # random.seed(seed)
204
+ cudnn.benchmark = True
205
+
206
+ if config.TRAIN.SCALE_LR:
207
+ # Linear scale the learning rate according to total batch size - may not be optimal.
208
+ linear_scaled_lr = config.TRAIN.BASE_LR * config.DATA.BATCH_SIZ
spai/__pycache__/__init__.cpython-310.pyc ADDED
Binary file (164 Bytes). View file
 
spai/__pycache__/__init__.cpython-313.pyc ADDED
Binary file (144 Bytes). View file
 
spai/__pycache__/__main__.cpython-310.pyc ADDED
Binary file (28.9 kB). View file
 
spai/__pycache__/__main__.cpython-313.pyc ADDED
Binary file (9.31 kB). View file
 
spai/__pycache__/config.cpython-310.pyc ADDED
Binary file (8.44 kB). View file
 
spai/__pycache__/config.cpython-313.pyc ADDED
Binary file (20.1 kB). View file
 
spai/__pycache__/data_utils.cpython-310.pyc ADDED
Binary file (1.47 kB). View file
 
spai/__pycache__/data_utils.cpython-313.pyc ADDED
Binary file (2.15 kB). View file
 
spai/__pycache__/logger.cpython-310.pyc ADDED
Binary file (1.11 kB). View file
 
spai/__pycache__/logger.cpython-313.pyc ADDED
Binary file (1.93 kB). View file
 
spai/__pycache__/lr_scheduler.cpython-310.pyc ADDED
Binary file (5.14 kB). View file
 
spai/__pycache__/lr_scheduler.cpython-313.pyc ADDED
Binary file (7.59 kB). View file
 
spai/__pycache__/metrics.cpython-310.pyc ADDED
Binary file (5.75 kB). View file
 
spai/__pycache__/metrics.cpython-313.pyc ADDED
Binary file (10.9 kB). View file
 
spai/__pycache__/onnx.cpython-310.pyc ADDED
Binary file (3.98 kB). View file
 
spai/__pycache__/onnx.cpython-313.pyc ADDED
Binary file (7.25 kB). View file
 
spai/__pycache__/optimizer.cpython-310.pyc ADDED
Binary file (4.6 kB). View file
 
spai/__pycache__/optimizer.cpython-313.pyc ADDED
Binary file (8.91 kB). View file
 
spai/__pycache__/utils.cpython-310.pyc ADDED
Binary file (14.1 kB). View file
 
spai/__pycache__/utils.cpython-313.pyc ADDED
Binary file (25.4 kB). View file
 
spai/config.py ADDED
@@ -0,0 +1,494 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # SPDX-FileCopyrightText: Copyright (c) 2025 Centre for Research and Technology Hellas
2
+ # and University of Amsterdam. All rights reserved.
3
+ # SPDX-License-Identifier: Apache-2.0
4
+ #
5
+ # Licensed under the Apache License, Version 2.0 (the "License");
6
+ # you may not use this file except in compliance with the License.
7
+ # You may obtain a copy of the License at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # Unless required by applicable law or agreed to in writing, software
12
+ # distributed under the License is distributed on an "AS IS" BASIS,
13
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14
+ # See the License for the specific language governing permissions and
15
+ # limitations under the License.
16
+
17
+ import os
18
+ from typing import Optional, Any
19
+
20
+ import yaml
21
+ from yacs.config import CfgNode as CN
22
+
23
+ _C = CN()
24
+
25
+ # Base config files
26
+ _C.BASE = ['']
27
+
28
+ # -----------------------------------------------------------------------------
29
+ # Data settings
30
+ # -----------------------------------------------------------------------------
31
+ _C.DATA = CN()
32
+ # Batch size for a single GPU, could be overwritten by command line argument
33
+ _C.DATA.BATCH_SIZE = 128
34
+ # Batch size for validation. If it is set to None, DATA.BATCH_SIZE will be used.
35
+ _C.DATA.VAL_BATCH_SIZE = None
36
+ # Batch size for test. If it is set to None, DATA.BATCH_SIZE will be used.
37
+ _C.DATA.TEST_BATCH_SIZE = None
38
+ # Path to dataset, could be overwritten by command line argument
39
+ _C.DATA.DATA_PATH = ''
40
+ # Root path for the relative paths included in a dataset csv file. Not used when
41
+ # the DATA.DATA_PATH does not point to a csv file.
42
+ _C.DATA.CSV_ROOT = ''
43
+ # A list of paths to the test datasets. Can be overwritten by command line argument.
44
+ _C.DATA.TEST_DATA_PATH = []
45
+ # A list of paths that will be used as root directories for the paths in the test csv files.
46
+ _C.DATA.TEST_DATA_CSV_ROOT = []
47
+ # Path to an LMDB filestorage. When this option is not None, the dataset's files are loaded
48
+ # from this one, instead of the filesystem.
49
+ _C.DATA.LMDB_PATH = None
50
+ # Dataset name
51
+ _C.DATA.DATASET = 'imagenet'
52
+ # Input image size
53
+ _C.DATA.IMG_SIZE = 224
54
+ # Minimal crop scale
55
+ _C.DATA.MIN_CROP_SCALE = 0.2
56
+ # Interpolation to resize image (random, bilinear, bicubic)
57
+ _C.DATA.INTERPOLATION = 'bicubic'
58
+ # Pin CPU memory in DataLoader for more efficient (sometimes) transfer to GPU.
59
+ _C.DATA.PIN_MEMORY = True
60
+ # Number of data loading threads
61
+ _C.DATA.NUM_WORKERS = 24
62
+ # Number of batches to be prefetched by each worker.
63
+ _C.DATA.PREFETCH_FACTOR = 2
64
+ # Prefetch factor for validation data loaders.
65
+ _C.DATA.VAL_PREFETCH_FACTOR = None
66
+ # Prefetch factor for test data loaders.
67
+ _C.DATA.TEST_PREFETCH_FACTOR = None
68
+
69
+ # Filter type, support 'mfm', 'sr', 'deblur', 'denoise'
70
+ _C.DATA.FILTER_TYPE = 'mfm'
71
+ # [MFM] Sampling ratio for low-pass filters
72
+ _C.DATA.SAMPLE_RATIO = 0.5
73
+ # [MFM] First frequency mask radius
74
+ # should be smaller than half of the image size
75
+ _C.DATA.MASK_RADIUS1 = 16
76
+ # [MFM] Second frequency mask radius
77
+ # should be larger than the first radius
78
+ # only used when masking a frequency band
79
+ # setting a larger value than the image size, e.g., 999, will have no effect
80
+ _C.DATA.MASK_RADIUS2 = 999
81
+ # [SR] SR downsampling scale factor, only used when FILTER_TYPE == 'sr'
82
+ _C.DATA.SR_FACTOR = 8
83
+ # [Deblur] Deblur parameters, only used when FILTER_TYPE == 'deblur'
84
+ _C.DATA.BLUR = CN()
85
+ _C.DATA.BLUR.KERNEL_SIZE = [7, 9, 11, 13, 15, 17, 19, 21]
86
+ _C.DATA.BLUR.KERNEL_LIST = ['iso', 'aniso', 'generalized_iso', 'generalized_aniso', 'plateau_iso', 'plateau_aniso', 'sinc']
87
+ _C.DATA.BLUR.KERNEL_PROB = [0.405, 0.225, 0.108, 0.027, 0.108, 0.027, 0.1]
88
+ _C.DATA.BLUR.SIGMA_X = [0.2, 3]
89
+ _C.DATA.BLUR.SIGMA_Y = [0.2, 3]
90
+ _C.DATA.BLUR.ROTATE_ANGLE = [-3.1416, 3.1416]
91
+ _C.DATA.BLUR.BETA_GAUSSIAN = [0.5, 4]
92
+ _C.DATA.BLUR.BETA_PLATEAU = [1, 2]
93
+ # [Denoise] Denoise parameters, only used when FILTER_TYPE == 'denoise'
94
+ _C.DATA.NOISE = CN()
95
+ _C.DATA.NOISE.TYPE = ['gaussian', 'poisson']
96
+ _C.DATA.NOISE.PROB = [0.5, 0.5]
97
+ _C.DATA.NOISE.GAUSSIAN_SIGMA = [1, 30]
98
+ _C.DATA.NOISE.GAUSSIAN_GRAY_NOISE_PROB = 0.4
99
+ _C.DATA.NOISE.POISSON_SCALE = [0.05, 3]
100
+ _C.DATA.NOISE.POISSON_GRAY_NOISE_PROB = 0.4
101
+ # Number of augmented views for each batch. When SupCon loss is employed, this number
102
+ # should be at least 2.
103
+ _C.DATA.AUGMENTED_VIEWS = 1
104
+
105
+ # -----------------------------------------------------------------------------
106
+ # Model settings
107
+ # -----------------------------------------------------------------------------
108
+ _C.MODEL = CN()
109
+ # Model type
110
+ _C.MODEL.TYPE = 'vit'
111
+ # Type of weights that will be used to initialize the backbone. Supported "mfm", "clip", "dinov2".
112
+ _C.MODEL_WEIGHTS = "mfm"
113
+ # Model name
114
+ _C.MODEL.NAME = 'pretrain'
115
+ # Checkpoint to resume, could be overwritten by command line argument
116
+ _C.MODEL.RESUME = ''
117
+ # Number of classes, overwritten in data preparation
118
+ _C.MODEL.NUM_CLASSES = 1000
119
+ # Dropout rate for the backbone model.
120
+ _C.MODEL.DROP_RATE = 0.0
121
+ # Dropout rate for the trainable SID layers.
122
+ _C.MODEL.SID_DROPOUT = 0.5
123
+ # Drop path rate
124
+ _C.MODEL.DROP_PATH_RATE = 0.1
125
+ # Label Smoothing
126
+ _C.MODEL.LABEL_SMOOTHING = 0.1
127
+ # Required normalization to be applied to the image before provided to the model.
128
+ _C.MODEL.REQUIRED_NORMALIZATION = "imagenet"
129
+ # Approach used for the Synthetic Image Detection task. "single_extraction" and "freq_restoration"
130
+ # are currently supported.
131
+ _C.MODEL.SID_APPROACH = "single_extraction"
132
+ # Whether the model accepts a fixed resolution image to its input or an arbitrary resolution image.
133
+ # Supported values are "fixed" and "arbitrary"
134
+ _C.MODEL.RESOLUTION_MODE = "fixed"
135
+ # Batch size used internally by patched models for feature extraction. If not provided,
136
+ # it is determined by the batch size of the input.
137
+ _C.MODEL.FEATURE_EXTRACTION_BATCH = None
138
+
139
+ # Swin Transformer parameters
140
+ _C.MODEL.SWIN = CN()
141
+ _C.MODEL.SWIN.PATCH_SIZE = 4
142
+ _C.MODEL.SWIN.IN_CHANS = 3
143
+ _C.MODEL.SWIN.EMBED_DIM = 96
144
+ _C.MODEL.SWIN.DEPTHS = [2, 2, 6, 2]
145
+ _C.MODEL.SWIN.NUM_HEADS = [3, 6, 12, 24]
146
+ _C.MODEL.SWIN.WINDOW_SIZE = 7
147
+ _C.MODEL.SWIN.MLP_RATIO = 4.
148
+ _C.MODEL.SWIN.QKV_BIAS = True
149
+ _C.MODEL.SWIN.QK_SCALE = None
150
+ _C.MODEL.SWIN.APE = False
151
+ _C.MODEL.SWIN.PATCH_NORM = True
152
+
153
+ # Vision Transformer parameters
154
+ _C.MODEL.VIT = CN()
155
+ _C.MODEL.VIT.PATCH_SIZE = 16
156
+ _C.MODEL.VIT.IN_CHANS = 3
157
+ _C.MODEL.VIT.EMBED_DIM = 768
158
+ _C.MODEL.VIT.DEPTH = 12
159
+ _C.MODEL.VIT.NUM_HEADS = 12
160
+ _C.MODEL.VIT.MLP_RATIO = 4
161
+ _C.MODEL.VIT.QKV_BIAS = True
162
+ _C.MODEL.VIT.INIT_VALUES = 0.1
163
+ # learnable absolute positional embedding
164
+ _C.MODEL.VIT.USE_APE = True
165
+ # fixed sin-cos positional embedding
166
+ _C.MODEL.VIT.USE_FPE = False
167
+ # relative position bias
168
+ _C.MODEL.VIT.USE_RPB = False
169
+ _C.MODEL.VIT.USE_SHARED_RPB = False
170
+ _C.MODEL.VIT.USE_MEAN_POOLING = False
171
+ # Vision Transformer decoder parameters
172
+ _C.MODEL.VIT.DECODER = CN()
173
+ _C.MODEL.VIT.DECODER.EMBED_DIM = 512
174
+ _C.MODEL.VIT.DECODER.DEPTH = 0
175
+ _C.MODEL.VIT.DECODER.NUM_HEADS = 16
176
+
177
+ # Features processor parameter
178
+ # Supported features processors: "mean_norm", "norm_max", "rine"
179
+ _C.MODEL.VIT.FEATURES_PROCESSOR = "rine"
180
+ _C.MODEL.VIT.USE_INTERMEDIATE_LAYERS = False
181
+ _C.MODEL.VIT.INTERMEDIATE_LAYERS = [2, 5, 8, 11]
182
+ _C.MODEL.VIT.PROJECTION_DIM = 1024
183
+ _C.MODEL.VIT.PROJECTION_LAYERS = 2
184
+ _C.MODEL.VIT.PATCH_PROJECTION = False
185
+ _C.MODEL.VIT.PATCH_PROJECTION_PER_FEATURE = False
186
+ # Supported patch pooling: "mean", "l2_max"
187
+ _C.MODEL.VIT.PATCH_POOLING = "mean"
188
+
189
+ # Frequency Restoration Estimator parameters
190
+ _C.MODEL.FRE = CN()
191
+ _C.MODEL.FRE.MASKING_RADIUS = 16
192
+ _C.MODEL.FRE.PROJECTOR_LAST_LAYER_ACTIVATION_TYPE = "gelu"
193
+ _C.MODEL.FRE.ORIGINAL_IMAGE_FEATURES_BRANCH = False
194
+ _C.MODEL.FRE.DISABLE_RECONSTRUCTION_SIMILARITY = False
195
+
196
+ # PatchBasedMFViT related parameters
197
+ _C.MODEL.PATCH_VIT = CN()
198
+ _C.MODEL.PATCH_VIT.PATCH_STRIDE = 224
199
+ _C.MODEL.PATCH_VIT.NUM_HEADS = 12
200
+ _C.MODEL.PATCH_VIT.ATTN_EMBED_DIM = 1536
201
+ _C.MODEL.PATCH_VIT.MINIMUM_PATCHES = 1
202
+
203
+ # Classification head parameters
204
+ _C.MODEL.CLS_HEAD = CN()
205
+ _C.MODEL.CLS_HEAD.MLP_RATIO = 4
206
+
207
+ # ResNet parameters
208
+ _C.MODEL.RESNET = CN()
209
+ _C.MODEL.RESNET.LAYERS = [3, 4, 6, 3]
210
+ _C.MODEL.RESNET.IN_CHANS = 3
211
+
212
+ # [MFM] Reconstruction target type, support 'normal', 'masked'
213
+ _C.MODEL.RECOVER_TARGET_TYPE = 'normal'
214
+ # [MFM] Frequency loss parameters
215
+ _C.MODEL.FREQ_LOSS = CN()
216
+ _C.MODEL.FREQ_LOSS.LOSS_GAMMA = 1.
217
+ _C.MODEL.FREQ_LOSS.MATRIX_GAMMA = 1.
218
+ _C.MODEL.FREQ_LOSS.PATCH_FACTOR = 1
219
+ _C.MODEL.FREQ_LOSS.AVE_SPECTRUM = False
220
+ _C.MODEL.FREQ_LOSS.WITH_MATRIX = False
221
+ _C.MODEL.FREQ_LOSS.LOG_MATRIX = False
222
+ _C.MODEL.FREQ_LOSS.BATCH_MATRIX = False
223
+
224
+ # -----------------------------------------------------------------------------
225
+ # Training settings
226
+ # -----------------------------------------------------------------------------
227
+ _C.TRAIN = CN()
228
+ _C.TRAIN.START_EPOCH = 0
229
+ _C.TRAIN.EPOCHS = 300
230
+ _C.TRAIN.WARMUP_EPOCHS = 20
231
+ _C.TRAIN.WEIGHT_DECAY = 0.05
232
+ _C.TRAIN.BASE_LR = 3e-4
233
+ _C.TRAIN.WARMUP_LR = 2.5e-7
234
+ _C.TRAIN.MIN_LR = 2.5e-6
235
+ # Clip gradient norm
236
+ _C.TRAIN.CLIP_GRAD = 3.0
237
+ # Auto resume from latest checkpoint
238
+ _C.TRAIN.AUTO_RESUME = True
239
+ # Gradient accumulation steps
240
+ # could be overwritten by command line argument
241
+ _C.TRAIN.ACCUMULATION_STEPS = 1
242
+ # Whether to use gradient checkpointing to save memory
243
+ # could be overwritten by command line argument
244
+ _C.TRAIN.USE_CHECKPOINT = False
245
+
246
+ # LR scheduler
247
+ # Supported modes: "supervised", "contrastive"
248
+ _C.TRAIN.MODE = "supervised"
249
+ _C.TRAIN.LR_SCHEDULER = CN()
250
+ _C.TRAIN.LR_SCHEDULER.NAME = 'cosine'
251
+ # Epoch interval to decay LR, used in StepLRScheduler
252
+ _C.TRAIN.LR_SCHEDULER.DECAY_EPOCHS = 30
253
+ # LR decay rate, used in StepLRScheduler
254
+ _C.TRAIN.LR_SCHEDULER.DECAY_RATE = 0.1
255
+ # Gamma / Multi steps value, used in MultiStepLRScheduler
256
+ _C.TRAIN.LR_SCHEDULER.GAMMA = 0.1
257
+ _C.TRAIN.LR_SCHEDULER.MULTISTEPS = []
258
+ # A flag that indicates whether to scale lr according to batch size and grad accumulation steps.
259
+ _C.TRAIN.SCALE_LR = False
260
+ # Optimizer
261
+ _C.TRAIN.OPTIMIZER = CN()
262
+ _C.TRAIN.OPTIMIZER.NAME = 'adamw'
263
+ # Optimizer Epsilon
264
+ _C.TRAIN.OPTIMIZER.EPS = 1e-8
265
+ # Optimizer Betas
266
+ _C.TRAIN.OPTIMIZER.BETAS = (0.9, 0.999)
267
+ # SGD momentum
268
+ _C.TRAIN.OPTIMIZER.MOMENTUM = 0.9
269
+ _C.TRAIN.LOSS = "bce_supcont"
270
+ _C.TRAIN.TRIPLET_LOSS_MARGIN = 0.5
271
+ # Layer decay for fine-tuning
272
+ _C.TRAIN.LAYER_DECAY = 1.0
273
+
274
+ # -----------------------------------------------------------------------------
275
+ # Augmentation settings
276
+ # -----------------------------------------------------------------------------
277
+ _C.AUG = CN()
278
+ # Crop augmentation
279
+ _C.AUG.MIN_CROP_AREA = 0.2
280
+ _C.AUG.MAX_CROP_AREA = 1.0
281
+ # Flip augmentation
282
+ _C.AUG.HORIZONTAL_FLIP_PROB = 0.5
283
+ _C.AUG.VERTICAL_FLIP_PROB = 0.5
284
+ # Rotation augmentation
285
+ _C.AUG.ROTATION_PROB = 0.5
286
+ _C.AUG.ROTATION_DEGREES = 90
287
+ # Gaussian blur augmentation
288
+ _C.AUG.GAUSSIAN_BLUR_PROB = 0.5
289
+ _C.AUG.GAUSSIAN_BLUR_LIMIT = (3, 9)
290
+ _C.AUG.GAUSSIAN_BLUR_SIGMA = (0.01, 0.5)
291
+ # Gaussian noise augmentation
292
+ _C.AUG.GAUSSIAN_NOISE_PROB = 0.5
293
+ # JPEG compression augmentation
294
+ _C.AUG.JPEG_COMPRESSION_PROB = 0.5
295
+ _C.AUG.JPEG_MIN_QUALITY = 50
296
+ _C.AUG.JPEG_MAX_QUALITY = 100
297
+ # WEBP compression augmentation
298
+ _C.AUG.WEBP_COMPRESSION_PROB = .0
299
+ _C.AUG.WEBP_MIN_QUALITY = 50
300
+ _C.AUG.WEBP_MAX_QUALITY = 100
301
+ # Color jitter augmentation
302
+ _C.AUG.COLOR_JITTER = .0
303
+ _C.AUG.COLOR_JITTER_BRIGHTNESS_RANGE = (0.8, 1.2)
304
+ _C.AUG.COLOR_JITTER_CONTRAST_RANGE = (0.8, 1.2)
305
+ _C.AUG.COLOR_JITTER_SATURATION_RANGE = (0.8, 1.2)
306
+ _C.AUG.COLOR_JITTER_HUE_RANGE = (-0.1, 0.1)
307
+ # Sharpen augmentation
308
+ _C.AUG.SHARPEN_PROB = .0
309
+ _C.AUG.SHARPEN_ALPHA_RANGE = (0.01, 0.4)
310
+ _C.AUG.SHARPEN_LIGHTNESS_RANGE = (0.95, 1)
311
+ # Use AutoAugment policy. "v0" or "original"
312
+ _C.AUG.AUTO_AUGMENT = 'rand-m9-mstd0.5-inc1'
313
+ # Random erase prob
314
+ _C.AUG.REPROB = 0.25
315
+ # Random erase mode
316
+ _C.AUG.REMODE = 'pixel'
317
+ # Random erase count
318
+ _C.AUG.RECOUNT = 1
319
+ # Probability of applying blurring
320
+ _C.AUG.BLUR_PROB = 0.25
321
+ # Mixup alpha, mixup enabled if > 0
322
+ _C.AUG.MIXUP = 0.8
323
+ # Cutmix alpha, cutmix enabled if > 0
324
+ _C.AUG.CUTMIX = 1.0
325
+ # Cutmix min/max ratio, overrides alpha and enables cutmix if set
326
+ _C.AUG.CUTMIX_MINMAX = None
327
+ # Probability of performing mixup or cutmix when either/both is enabled
328
+ _C.AUG.MIXUP_PROB = 1.0
329
+ # Probability of switching to cutmix when both mixup and cutmix enabled
330
+ _C.AUG.MIXUP_SWITCH_PROB = 0.5
331
+ # How to apply mixup/cutmix params. Per "batch", "pair", or "elem"
332
+ _C.AUG.MIXUP_MODE = 'batch'
333
+
334
+ # -----------------------------------------------------------------------------
335
+ # Testing settings
336
+ # -----------------------------------------------------------------------------
337
+ _C.TEST = CN()
338
+ # Whether to use center crop when testing.
339
+ _C.TEST.CROP = True
340
+ # Size for resizing images during testing.
341
+ _C.TEST.MAX_SIZE: Optional[int] = None
342
+ # When this option is set to True, the original resolution image is provided to the model.
343
+ # Setting this option to True automatically sets the batch size for validation/testing to 1.
344
+ _C.TEST.ORIGINAL_RESOLUTION = False
345
+ # Approach that will be used for generating different views of an image during testing.
346
+ # Currently, "tencrop" and None are supported.
347
+ _C.TEST.VIEWS_GENERATION_APPROACH = None
348
+ # Approach that will be used to combine the scores predicted for multiple views of the same
349
+ # image. This value is meaningful only when a view generation approach is used.
350
+ # Currently, "mean" and "max" are supported.
351
+ _C.TEST.VIEWS_REDUCTION_APPROACH = "mean"
352
+ # A flag that when set to True exports the analysis of the Spectral Context Attention.
353
+ _C.TEST.EXPORT_IMAGE_PATCHES = False
354
+ # -----------------------------------------------------------------------------
355
+ # Setting for Test-Time Perturbations
356
+ # -----------------------------------------------------------------------------
357
+ # Gaussian blur perturbation.
358
+ _C.TEST.GAUSSIAN_BLUR = False
359
+ _C.TEST.GAUSSIAN_BLUR_KERNEL_SIZE = 3
360
+ # Gaussian noise perturbation.
361
+ _C.TEST.GAUSSIAN_NOISE = False
362
+ _C.TEST.GAUSSIAN_NOISE_SIGMA = 1.0
363
+ # JPEG compression perturbation.
364
+ _C.TEST.JPEG_COMPRESSION = False
365
+ _C.TEST.JPEG_QUALITY = 100
366
+ # WEBP compression augmentation.
367
+ _C.TEST.WEBP_COMPRESSION = False
368
+ _C.TEST.WEBP_QUALITY = 100
369
+ # Scale perturbation.
370
+ _C.TEST.SCALE = False
371
+ _C.TEST.SCALE_FACTOR = 1.0
372
+
373
+ # -----------------------------------------------------------------------------
374
+ # Misc
375
+ # -----------------------------------------------------------------------------
376
+ # Mixed precision opt level, if O0, no amp is used ('O0', 'O1', 'O2')
377
+ # overwritten by command line argument
378
+ _C.AMP_OPT_LEVEL = ''
379
+ # Path to output folder, overwritten by command line argument
380
+ _C.OUTPUT = ''
381
+ # Tag of experiment, overwritten by command line argument
382
+ _C.TAG = 'default'
383
+ # Frequency to save checkpoint
384
+ _C.SAVE_FREQ = 10
385
+ # Frequency to logging info
386
+ _C.PRINT_FREQ = 10
387
+ # Fixed random seed
388
+ _C.SEED = 0
389
+ # Perform evaluation only, overwritten by command line argument
390
+ _C.EVAL_MODE = False
391
+ # Test throughput only, overwritten by command line argument
392
+ _C.THROUGHPUT_MODE = False
393
+ # Local rank for DistributedDataParallel, given by command line argument
394
+ _C.LOCAL_RANK = 0
395
+
396
+ # Path to pre-trained model
397
+ _C.PRETRAINED = ''
398
+
399
+
400
def _update_config_from_file(config, cfg_file):
    """Merge settings from a YAML config file into ``config``.

    Files listed under the ``BASE`` key (paths relative to *cfg_file*) are
    merged first, recursively, so that values from *cfg_file* itself take
    precedence over its bases. The config is frozen on return.

    Args:
        config: The yacs ``CfgNode`` to update in place.
        cfg_file: Path to the YAML configuration file.
    """
    config.defrost()
    with open(cfg_file, 'r') as f:
        # safe_load is sufficient for plain config scalars/lists and, unlike
        # FullLoader, cannot construct arbitrary Python objects from the file.
        # An empty YAML file loads as None; normalize it to an empty mapping.
        yaml_cfg = yaml.safe_load(f) or {}

    # Merge base configs first so the current file overrides them.
    for cfg in yaml_cfg.setdefault('BASE', ['']):
        if cfg:
            _update_config_from_file(
                config, os.path.join(os.path.dirname(cfg_file), cfg)
            )
    print('=> merge config from {}'.format(cfg_file))
    config.merge_from_file(cfg_file)
    config.freeze()
413
+
414
+
415
def update_config(config, args):
    """Update a yacs config from a config file plus a dict of CLI arguments.

    The config file named by ``args["cfg"]`` is merged first, then any
    free-form ``args["opts"]`` key/value overrides, then the explicitly
    supported named arguments below. The config is frozen on return.

    Args:
        config: The yacs ``CfgNode`` to update in place.
        args: Mapping of argument names to values. Must contain ``"cfg"``;
            every other key is optional and only applied when truthy.
    """
    import ast

    _update_config_from_file(config, args["cfg"])

    config.defrost()
    if "opts" in args:
        options: list[Any] = []
        for (k, v) in args["opts"]:
            options.append(k)
            # Parse the textual value into a Python literal (numbers, bools,
            # tuples, ...) when possible. Bare strings such as "cosine" are
            # kept verbatim instead of crashing in eval() with a NameError.
            try:
                options.append(ast.literal_eval(v))
            except (ValueError, SyntaxError):
                options.append(v)
        config.merge_from_list(options)

    def _check_args(name):
        # True when the argument was provided and is truthy.
        return name in args and bool(args[name])

    # Merge from specific, explicitly supported arguments.
    if _check_args('batch_size'):
        config.DATA.BATCH_SIZE = args["batch_size"]
    if _check_args('data_path'):
        config.DATA.DATA_PATH = args["data_path"]
    if _check_args('csv_root_dir'):
        config.DATA.CSV_ROOT = args["csv_root_dir"]
    if _check_args("lmdb_path"):
        config.DATA.LMDB_PATH = args["lmdb_path"]
    if _check_args('resume'):
        config.MODEL.RESUME = args["resume"]
    if _check_args('pretrained'):
        config.PRETRAINED = args["pretrained"]
    if _check_args('accumulation_steps'):
        config.TRAIN.ACCUMULATION_STEPS = args["accumulation_steps"]
    if _check_args('use_checkpoint'):
        config.TRAIN.USE_CHECKPOINT = True
    if _check_args('amp_opt_level'):
        config.AMP_OPT_LEVEL = args["amp_opt_level"]
    if _check_args('output'):
        config.OUTPUT = args["output"]
    if _check_args('tag'):
        config.TAG = args["tag"]
    if _check_args('eval'):
        config.EVAL_MODE = True
    if _check_args('throughput'):
        config.THROUGHPUT_MODE = True
    if _check_args('test_csv'):
        config.DATA.TEST_DATA_PATH = args["test_csv"]
    if _check_args('test_csv_root'):
        config.DATA.TEST_DATA_CSV_ROOT = args["test_csv_root"]
    if _check_args('learning_rate'):
        config.TRAIN.BASE_LR = args["learning_rate"]
    if _check_args('resize_to'):
        config.TEST.MAX_SIZE = args["resize_to"]
    if _check_args("local_rank"):
        # set local rank for distributed training
        config.LOCAL_RANK = args["local_rank"]
    if _check_args("data_workers"):
        config.DATA.NUM_WORKERS = args["data_workers"]
    if _check_args("disable_pin_memory"):
        # BUGFIX: this previously created a dangling top-level
        # ``config.PIN_MEMORY`` entry; the option is declared (and read) as
        # ``config.DATA.PIN_MEMORY``, so the flag was silently a no-op.
        config.DATA.PIN_MEMORY = False
    if _check_args("data_prefetch_factor"):
        config.DATA.PREFETCH_FACTOR = args["data_prefetch_factor"]
    # Final output folder: <output>/<model name>/<tag>.
    config.OUTPUT = os.path.join(config.OUTPUT, config.MODEL.NAME, config.TAG)

    config.freeze()
479
+
480
+
481
def get_config(args):
    """Build a config from the defaults, a config file, and CLI arguments.

    A clone of the module-level defaults is updated, so repeated calls never
    mutate the shared default ``_C`` node.
    """
    cfg = _C.clone()
    update_config(cfg, args)
    return cfg
489
+
490
+
491
def get_custom_config(cfg):
    """Return a fresh config: the defaults merged with the given config file."""
    custom = _C.clone()
    _update_config_from_file(custom, cfg)
    return custom
spai/data/__init__.py ADDED
@@ -0,0 +1,26 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # SPDX-FileCopyrightText: Copyright (c) 2025 Centre for Research and Technology Hellas
2
+ # and University of Amsterdam. All rights reserved.
3
+ # SPDX-License-Identifier: Apache-2.0
4
+ #
5
+ # Licensed under the Apache License, Version 2.0 (the "License");
6
+ # you may not use this file except in compliance with the License.
7
+ # You may obtain a copy of the License at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # Unless required by applicable law or agreed to in writing, software
12
+ # distributed under the License is distributed on an "AS IS" BASIS,
13
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14
+ # See the License for the specific language governing permissions and
15
+ # limitations under the License.
16
+
17
+ from .data_mfm import build_loader_mfm
18
+ from .data_finetune import build_loader_finetune, build_loader_test
19
+
20
def build_loader(config, logger, is_pretrain, is_test):
    """Dispatch to the appropriate data-loader builder.

    ``is_pretrain`` takes precedence over ``is_test``; when neither flag is
    set, the fine-tuning loader is built.
    """
    if is_pretrain:
        return build_loader_mfm(config, logger)
    if is_test:
        return build_loader_test(config, logger)
    return build_loader_finetune(config, logger)
spai/data/__pycache__/__init__.cpython-310.pyc ADDED
Binary file (501 Bytes). View file
 
spai/data/__pycache__/__init__.cpython-313.pyc ADDED
Binary file (577 Bytes). View file
 
spai/data/__pycache__/blur_kernels.cpython-310.pyc ADDED
Binary file (14.5 kB). View file
 
spai/data/__pycache__/blur_kernels.cpython-313.pyc ADDED
Binary file (20.2 kB). View file
 
spai/data/__pycache__/data_finetune.cpython-310.pyc ADDED
Binary file (19.4 kB). View file
 
spai/data/__pycache__/data_finetune.cpython-313.pyc ADDED
Binary file (38.6 kB). View file
 
spai/data/__pycache__/data_mfm.cpython-310.pyc ADDED
Binary file (4.89 kB). View file
 
spai/data/__pycache__/data_mfm.cpython-313.pyc ADDED
Binary file (9.35 kB). View file
 
spai/data/__pycache__/filestorage.cpython-310.pyc ADDED
Binary file (10.8 kB). View file
 
spai/data/__pycache__/filestorage.cpython-313.pyc ADDED
Binary file (17.5 kB). View file
 
spai/data/__pycache__/random_degradations.cpython-310.pyc ADDED
Binary file (12.2 kB). View file
 
spai/data/__pycache__/random_degradations.cpython-313.pyc ADDED
Binary file (20.9 kB). View file
 
spai/data/__pycache__/readers.cpython-310.pyc ADDED
Binary file (6.41 kB). View file
 
spai/data/__pycache__/readers.cpython-313.pyc ADDED
Binary file (9.32 kB). View file
 
spai/data/blur_kernels.py ADDED
@@ -0,0 +1,539 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # This code is referenced from BasicSR with modifications.
2
+ # Reference: https://github.com/xinntao/BasicSR/blob/master/basicsr/data/degradations.py # noqa
3
+ # Original licence: Copyright (c) 2020 xinntao, under the Apache 2.0 license.
4
+
5
+ import random
6
+
7
+ import numpy as np
8
+ import torch
9
+ from scipy import special
10
+
11
+
12
def get_rotated_sigma_matrix(sig_x, sig_y, theta):
    """Build the 2x2 covariance matrix of a rotated anisotropic Gaussian.

    Args:
        sig_x (float): Standard deviation along the horizontal direction.
        sig_y (float): Standard deviation along the vertical direction.
        theta (float): Rotation in radian.

    Returns:
        ndarray: The rotated sigma matrix, i.e. R @ diag(sx^2, sy^2) @ R^T.
    """
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    rotation = np.array([[cos_t, -sin_t],
                         [sin_t, cos_t]], dtype=np.float32)
    variances = np.diag([sig_x**2, sig_y**2]).astype(np.float32)
    return np.matmul(rotation, np.matmul(variances, rotation.T))
29
+
30
+
31
+ def _mesh_grid(kernel_size):
32
+ """Generate the mesh grid, centering at zero.
33
+
34
+ Args:
35
+ kernel_size (int): The size of the kernel.
36
+
37
+ Returns:
38
+ x_grid (ndarray): x-coordinates with shape (kernel_size, kernel_size).
39
+ y_grid (ndarray): y-coordiantes with shape (kernel_size, kernel_size).
40
+ xy_grid (ndarray): stacked coordinates with shape
41
+ (kernel_size, kernel_size, 2).
42
+ """
43
+
44
+ range_ = np.arange(-kernel_size // 2 + 1., kernel_size // 2 + 1.)
45
+ x_grid, y_grid = np.meshgrid(range_, range_)
46
+ xy_grid = np.hstack((x_grid.reshape((kernel_size * kernel_size, 1)),
47
+ y_grid.reshape(kernel_size * kernel_size,
48
+ 1))).reshape(kernel_size, kernel_size,
49
+ 2)
50
+
51
+ return xy_grid, x_grid, y_grid
52
+
53
+
54
def calculate_gaussian_pdf(sigma_matrix, grid):
    """Evaluate the unnormalized bivariate Gaussian density on a grid.

    Args:
        sigma_matrix (ndarray): The variance matrix with shape (2, 2).
        grid (ndarray): Coordinates generated by :func:`_mesh_grid`,
            with shape (K, K, 2), where K is the kernel size.

    Returns:
        ndarray: Un-normalized kernel values of shape (K, K).
    """
    precision = np.linalg.inv(sigma_matrix)
    # Quadratic form x^T Sigma^{-1} x evaluated per grid point.
    mahalanobis = np.sum(np.matmul(grid, precision) * grid, 2)
    return np.exp(-0.5 * mahalanobis)
70
+
71
+
72
def bivariate_gaussian(kernel_size,
                       sig_x,
                       sig_y=None,
                       theta=None,
                       grid=None,
                       is_isotropic=True):
    """Generate a bivariate isotropic or anisotropic Gaussian kernel.

    In isotropic mode, only `sig_x` is used; `sig_y` and `theta` are ignored.

    Args:
        kernel_size (int): The size of the kernel.
        sig_x (float): Standard deviation along the horizontal direction.
        sig_y (float | None, optional): Standard deviation along the vertical
            direction. If it is None, 'is_isotropic' must be set to True.
            Default: None.
        theta (float | None, optional): Rotation in radian. If it is None,
            'is_isotropic' must be set to True. Default: None.
        grid (ndarray, optional): Coordinates generated by :func:`_mesh_grid`,
            with shape (K, K, 2), where K is the kernel size. Default: None.
        is_isotropic (bool, optional): Whether to use an isotropic kernel.
            Default: True.

    Returns:
        ndarray: Normalized kernel (sums to 1).

    Raises:
        ValueError: If anisotropic mode is requested without `sig_y`.
    """
    if grid is None:
        grid, _, _ = _mesh_grid(kernel_size)

    if is_isotropic:
        sigma_matrix = np.diag([sig_x**2, sig_x**2]).astype(np.float32)
    else:
        if sig_y is None:
            raise ValueError('"sig_y" cannot be None if "is_isotropic" is '
                             'False.')
        sigma_matrix = get_rotated_sigma_matrix(sig_x, sig_y, theta)

    kernel = calculate_gaussian_pdf(sigma_matrix, grid)
    return kernel / np.sum(kernel)
117
+
118
+
119
def bivariate_generalized_gaussian(kernel_size,
                                   sig_x,
                                   sig_y=None,
                                   theta=None,
                                   beta=1,
                                   grid=None,
                                   is_isotropic=True):
    """Generate a bivariate generalized Gaussian kernel.

    Described in `Parameter Estimation For Multivariate Generalized
    Gaussian Distributions` by Pascal et. al (2013). In isotropic mode,
    only `sig_x` is used. `sig_y` and `theta` is ignored.

    Args:
        kernel_size (int): The size of the kernel
        sig_x (float): Standard deviation along horizontal direction
        sig_y (float | None, optional): Standard deviation along the vertical
            direction. If it is None, 'is_isotropic' must be set to True.
            Default: None.
        theta (float | None, optional): Rotation in radian. If it is None,
            'is_isotropic' must be set to True. Default: None.
        beta (float, optional): Shape parameter, beta = 1 is the normal
            distribution. Default: 1.
        grid (ndarray, optional): Coordinates generated by :func:`_mesh_grid`,
            with shape (K, K, 2), where K is the kernel size. Default: None
        is_isotropic (bool, optional): Whether to use an isotropic kernel.
            Default: True.

    Returns:
        kernel (ndarray): normalized kernel.

    Raises:
        ValueError: If anisotropic mode is requested without `sig_y`.
    """

    if grid is None:
        grid, _, _ = _mesh_grid(kernel_size)

    if is_isotropic:
        sigma_matrix = np.array([[sig_x**2, 0], [0,
                                                 sig_x**2]]).astype(np.float32)
    else:
        # Consistent with `bivariate_gaussian`: fail early with a clear error
        # instead of crashing inside `get_rotated_sigma_matrix` on None ** 2.
        if sig_y is None:
            raise ValueError('"sig_y" cannot be None if "is_isotropic" is '
                             'False.')
        sigma_matrix = get_rotated_sigma_matrix(sig_x, sig_y, theta)

    inverse_sigma = np.linalg.inv(sigma_matrix)
    # Generalized Gaussian: exp(-0.5 * (x^T Sigma^{-1} x)^beta).
    kernel = np.exp(
        -0.5 *
        np.power(np.sum(np.matmul(grid, inverse_sigma) * grid, 2), beta))
    kernel = kernel / np.sum(kernel)

    return kernel
168
+
169
+
170
def bivariate_plateau(kernel_size,
                      sig_x,
                      sig_y,
                      theta,
                      beta,
                      grid=None,
                      is_isotropic=True):
    """Generate a plateau-like anisotropic kernel.

    This kernel has a form of 1 / (1+x^(beta)).
    Ref: https://stats.stackexchange.com/questions/203629/is-there-a-plateau-shaped-distribution # noqa
    In the isotropic mode, only `sig_x` is used. `sig_y` and `theta` is ignored.

    Args:
        kernel_size (int): The size of the kernel
        sig_x (float): Standard deviation along horizontal direction
        sig_y (float): Standard deviation along the vertical direction.
        theta (float): Rotation in radian.
        beta (float): Shape parameter, beta = 1 is the normal distribution.
        grid (ndarray, optional): Coordinates generated by :func:`_mesh_grid`,
            with shape (K, K, 2), where K is the kernel size. Default: None
        is_isotropic (bool, optional): Whether to use an isotropic kernel.
            Default: True.

    Returns:
        kernel (ndarray): normalized kernel (i.e. sum to 1).
    """
    if grid is None:
        grid, _, _ = _mesh_grid(kernel_size)

    if is_isotropic:
        sigma_matrix = np.diag([sig_x**2, sig_x**2]).astype(np.float32)
    else:
        sigma_matrix = get_rotated_sigma_matrix(sig_x, sig_y, theta)

    inverse_sigma = np.linalg.inv(sigma_matrix)
    # Plateau profile: 1 / (1 + (x^T Sigma^{-1} x)^beta).
    quad_form = np.sum(np.matmul(grid, inverse_sigma) * grid, 2)
    kernel = np.reciprocal(np.power(quad_form, beta) + 1)
    return kernel / np.sum(kernel)
211
+
212
+
213
def random_bivariate_gaussian_kernel(kernel_size,
                                     sigma_x_range,
                                     sigma_y_range,
                                     rotation_range,
                                     noise_range=None,
                                     is_isotropic=True):
    """Randomly generate bivariate isotropic or anisotropic Gaussian kernels.

    In the isotropic mode, only `sigma_x_range` is used. `sigma_y_range` and
    `rotation_range` is ignored.

    Args:
        kernel_size (int): The size of the kernel; must be odd.
        sigma_x_range (tuple): The range of the standard deviation along the
            horizontal direction. Default: [0.6, 5]
        sigma_y_range (tuple): The range of the standard deviation along the
            vertical direction. Default: [0.6, 5]
        rotation_range (tuple): Range of rotation in radian.
        noise_range (tuple, optional): Multiplicative kernel noise.
            Default: None.
        is_isotropic (bool, optional): Whether to use an isotropic kernel.
            Default: True.

    Returns:
        kernel (ndarray): The kernel whose parameters are sampled from the
            specified range.
    """
    assert kernel_size % 2 == 1, 'Kernel size must be an odd number.'
    assert sigma_x_range[0] <= sigma_x_range[1], 'Wrong sigma_x_range.'
    sigma_x = random.uniform(sigma_x_range[0], sigma_x_range[1])

    if is_isotropic is False:
        assert sigma_y_range[0] <= sigma_y_range[1], 'Wrong sigma_y_range.'
        assert rotation_range[0] <= rotation_range[1], 'Wrong rotation_range.'
        sigma_y = random.uniform(sigma_y_range[0], sigma_y_range[1])
        rotation = random.uniform(rotation_range[0], rotation_range[1])
    else:
        # Isotropic: vertical sigma mirrors the horizontal one, no rotation.
        sigma_y, rotation = sigma_x, 0

    kernel = bivariate_gaussian(
        kernel_size, sigma_x, sigma_y, rotation, is_isotropic=is_isotropic)

    # Optionally perturb the kernel with multiplicative noise, then renormalize.
    if noise_range is not None:
        assert noise_range[0] <= noise_range[1], 'Wrong noise range.'
        noise = torch.empty(kernel.shape).uniform_(
            noise_range[0], noise_range[1]).numpy()
        kernel = kernel * noise
        kernel = kernel / np.sum(kernel)

    return kernel
266
+
267
+
268
def random_bivariate_generalized_gaussian_kernel(kernel_size,
                                                 sigma_x_range,
                                                 sigma_y_range,
                                                 rotation_range,
                                                 beta_range,
                                                 noise_range=None,
                                                 is_isotropic=True):
    """Randomly generate bivariate generalized Gaussian kernels.

    In the isotropic mode, only `sigma_x_range` is used. `sigma_y_range` and
    `rotation_range` is ignored.

    Args:
        kernel_size (int): The size of the kernel; must be odd.
        sigma_x_range (tuple): The range of the standard deviation along the
            horizontal direction. Default: [0.6, 5]
        sigma_y_range (tuple): The range of the standard deviation along the
            vertical direction. Default: [0.6, 5]
        rotation_range (tuple): Range of rotation in radian.
        beta_range (float): The range of the shape parameter, beta = 1 is the
            normal distribution.
        noise_range (tuple, optional): Multiplicative kernel noise.
            Default: None.
        is_isotropic (bool, optional): Whether to use an isotropic kernel.
            Default: True.

    Returns:
        kernel (ndarray): The sampled kernel.
    """
    assert kernel_size % 2 == 1, 'Kernel size must be an odd number.'
    assert sigma_x_range[0] <= sigma_x_range[1], 'Wrong sigma_x_range.'
    sigma_x = random.uniform(sigma_x_range[0], sigma_x_range[1])

    if is_isotropic is False:
        assert sigma_y_range[0] <= sigma_y_range[1], 'Wrong sigma_y_range.'
        assert rotation_range[0] <= rotation_range[1], 'Wrong rotation_range.'
        sigma_y = random.uniform(sigma_y_range[0], sigma_y_range[1])
        rotation = random.uniform(rotation_range[0], rotation_range[1])
    else:
        # Isotropic: vertical sigma mirrors the horizontal one, no rotation.
        sigma_y, rotation = sigma_x, 0

    # Sample beta below or above 1 with equal probability
    # (assumes beta_range[0] <= 1 <= beta_range[1]).
    lo, hi = (beta_range[0], 1) if random.random() <= 0.5 else (1, beta_range[1])
    beta = random.uniform(lo, hi)

    kernel = bivariate_generalized_gaussian(
        kernel_size, sigma_x, sigma_y, rotation, beta, is_isotropic=is_isotropic)

    # Optionally perturb the kernel with multiplicative noise, then renormalize.
    if noise_range is not None:
        assert noise_range[0] <= noise_range[1], 'Wrong noise range.'
        noise = torch.empty(kernel.shape).uniform_(
            noise_range[0], noise_range[1]).numpy()
        kernel = kernel * noise
        kernel = kernel / np.sum(kernel)

    return kernel
334
+
335
+
336
def random_bivariate_plateau_kernel(kernel_size,
                                    sigma_x_range,
                                    sigma_y_range,
                                    rotation_range,
                                    beta_range,
                                    noise_range=None,
                                    is_isotropic=True):
    """Randomly generate bivariate plateau kernels.

    In the isotropic mode, only `sigma_x_range` is used. `sigma_y_range` and
    `rotation_range` is ignored.

    Args:
        kernel_size (int): The size of the kernel; must be odd.
        sigma_x_range (tuple): The range of the standard deviation along the
            horizontal direction. Default: [0.6, 5]
        sigma_y_range (tuple): The range of the standard deviation along the
            vertical direction. Default: [0.6, 5]
        rotation_range (tuple): Range of rotation in radian.
        beta_range (float): The range of the shape parameter, beta = 1 is the
            normal distribution.
        noise_range (tuple, optional): Multiplicative kernel noise.
            Default: None.
        is_isotropic (bool, optional): Whether to use an isotropic kernel.
            Default: True.

    Returns:
        kernel (ndarray): The sampled kernel.
    """
    assert kernel_size % 2 == 1, 'Kernel size must be an odd number.'
    assert sigma_x_range[0] <= sigma_x_range[1], 'Wrong sigma_x_range.'
    sigma_x = random.uniform(sigma_x_range[0], sigma_x_range[1])

    if is_isotropic is False:
        assert sigma_y_range[0] <= sigma_y_range[1], 'Wrong sigma_y_range.'
        assert rotation_range[0] <= rotation_range[1], 'Wrong rotation_range.'
        sigma_y = random.uniform(sigma_y_range[0], sigma_y_range[1])
        rotation = random.uniform(rotation_range[0], rotation_range[1])
    else:
        # Isotropic: vertical sigma mirrors the horizontal one, no rotation.
        sigma_y, rotation = sigma_x, 0

    # Sample beta below or above 1 with equal probability.
    # TODO: this may be not proper
    lo, hi = (beta_range[0], 1) if random.random() <= 0.5 else (1, beta_range[1])
    beta = random.uniform(lo, hi)

    kernel = bivariate_plateau(
        kernel_size, sigma_x, sigma_y, rotation, beta, is_isotropic=is_isotropic)

    # Optionally perturb the kernel with multiplicative noise, then renormalize.
    if noise_range is not None:
        assert noise_range[0] <= noise_range[1], 'Wrong noise range.'
        noise = torch.empty(kernel.shape).uniform_(
            noise_range[0], noise_range[1]).numpy()
        kernel = kernel * noise
        kernel = kernel / np.sum(kernel)

    return kernel
402
+
403
+
404
def random_circular_lowpass_kernel(omega_range, kernel_size, pad_to=0):
    """Generate a 2D circularly symmetric low-pass (sinc) filter.

    Reference: https://dsp.stackexchange.com/questions/58301/2-d-circularly-symmetric-low-pass-filter # noqa

    Args:
        omega_range (tuple): The cutoff frequency in radian (pi is max); the
            cutoff is sampled uniformly from this range.
        kernel_size (int): The size of the kernel. It must be an odd number.
        pad_to (int, optional): The size of the padded kernel. It must be odd
            or zero. Default: 0.

    Returns:
        ndarray: The Sinc kernel with specified parameters, normalized to
            sum to 1.
    """
    assert kernel_size % 2 == 1, 'Kernel size must be an odd number.'
    omega = random.uniform(omega_range[0], omega_range[-1])

    center = (kernel_size - 1) / 2
    # The radial sinc expression divides by the distance from the center and
    # is 0/0 at the center pixel. Silence the warnings locally with
    # np.errstate (which, unlike a manual np.geterr()/np.seterr() pair,
    # restores the previous settings even if an exception is raised) and
    # patch the center afterwards with the analytic limit omega^2 / (4*pi).
    with np.errstate(divide='ignore', invalid='ignore'):
        kernel = np.fromfunction(
            lambda x, y: omega * special.j1(omega * np.sqrt(
                (x - center)**2 + (y - center)**2)) /
            (2 * np.pi * np.sqrt((x - center)**2 + (y - center)**2)),
            [kernel_size, kernel_size])
    kernel[(kernel_size - 1) // 2,
           (kernel_size - 1) // 2] = omega**2 / (4 * np.pi)
    kernel = kernel / np.sum(kernel)

    # Optionally zero-pad the kernel symmetrically up to pad_to x pad_to.
    if pad_to > kernel_size:
        pad_size = (pad_to - kernel_size) // 2
        kernel = np.pad(kernel, ((pad_size, pad_size), (pad_size, pad_size)))

    return kernel
441
+
442
+
443
def random_mixed_kernels(kernel_list,
                         kernel_prob,
                         kernel_size,
                         sigma_x_range=(0.6, 5),
                         sigma_y_range=(0.6, 5),
                         rotation_range=(-np.pi, np.pi),
                         beta_gaussian_range=(0.5, 8),
                         beta_plateau_range=(1, 2),
                         omega_range=(0, np.pi),
                         noise_range=None):
    """Randomly generate a kernel of a randomly chosen type.

    Args:
        kernel_list (list): A list of kernel types. Choices are
            'iso', 'aniso', 'generalized_iso', 'generalized_aniso',
            'plateau_iso', 'plateau_aniso', 'sinc'.
        kernel_prob (list): The probability of choosing of the corresponding
            kernel.
        kernel_size (int): The size of the kernel.
        sigma_x_range (tuple, optional): The range of the standard deviation
            along the horizontal direction. Default: (0.6, 5).
        sigma_y_range (tuple, optional): The range of the standard deviation
            along the vertical direction. Default: (0.6, 5).
        rotation_range (tuple, optional): Range of rotation in radian.
            Default: (-np.pi, np.pi).
        beta_gaussian_range (tuple, optional): The range of the shape parameter
            for generalized Gaussian. Default: (0.5, 8).
        beta_plateau_range (tuple, optional): The range of the shape parameter
            for plateau kernel. Default: (1, 2).
        omega_range (tuple, optional): The range of omega used in Sinc kernel.
            Default: (0, np.pi).
        noise_range (tuple, optional): Multiplicative kernel noise.
            Default: None. NOTE(review): it is only forwarded to the
            Gaussian-family kernels; plateau kernels are always generated
            noise-free, as in the original implementation — confirm intended.

    Returns:
        kernel (ndarray): The kernel whose parameters are sampled from the
            specified range.

    Raises:
        ValueError: If the chosen kernel type is not supported.
    """
    # Defaults are tuples (not lists) so they cannot be mutated across calls.
    kernel_type = random.choices(kernel_list, weights=kernel_prob)[0]
    if kernel_type in ('iso', 'aniso'):
        kernel = random_bivariate_gaussian_kernel(
            kernel_size,
            sigma_x_range,
            sigma_y_range,
            rotation_range,
            noise_range=noise_range,
            is_isotropic=(kernel_type == 'iso'))
    elif kernel_type in ('generalized_iso', 'generalized_aniso'):
        kernel = random_bivariate_generalized_gaussian_kernel(
            kernel_size,
            sigma_x_range,
            sigma_y_range,
            rotation_range,
            beta_gaussian_range,
            noise_range=noise_range,
            is_isotropic=(kernel_type == 'generalized_iso'))
    elif kernel_type in ('plateau_iso', 'plateau_aniso'):
        # Plateau kernels are deliberately generated without noise.
        kernel = random_bivariate_plateau_kernel(
            kernel_size,
            sigma_x_range,
            sigma_y_range,
            rotation_range,
            beta_plateau_range,
            noise_range=None,
            is_isotropic=(kernel_type == 'plateau_iso'))
    elif kernel_type == 'sinc':
        kernel = random_circular_lowpass_kernel(omega_range, kernel_size)
    else:
        # Previously an unknown type crashed with UnboundLocalError at the
        # return statement; fail explicitly instead.
        raise ValueError(f'Unsupported kernel type: {kernel_type}')

    return kernel
spai/data/data_finetune.py ADDED
@@ -0,0 +1,723 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # SPDX-FileCopyrightText: Copyright (c) 2025 Centre for Research and Technology Hellas
2
+ # and University of Amsterdam. All rights reserved.
3
+ # SPDX-License-Identifier: Apache-2.0
4
+ #
5
+ # Licensed under the Apache License, Version 2.0 (the "License");
6
+ # you may not use this file except in compliance with the License.
7
+ # You may obtain a copy of the License at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # Unless required by applicable law or agreed to in writing, software
12
+ # distributed under the License is distributed on an "AS IS" BASIS,
13
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14
+ # See the License for the specific language governing permissions and
15
+ # limitations under the License.
16
+
17
+ import collections
18
+ import os
19
+ import pathlib
20
+ import random
21
+ from functools import partial
22
+ from typing import Any, Union, Optional, Iterable
23
+ from collections.abc import Callable
24
+
25
+ import albumentations as A
26
+ import torchvision.transforms.functional
27
+ from albumentations.augmentations.transforms import ImageCompressionType
28
+ from albumentations.pytorch import ToTensorV2
29
+ import numpy as np
30
+ import torch
31
+ from torch.utils.data import DataLoader
32
+ from PIL import Image
33
+ from timm.data.constants import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD
34
+ from timm.data import Mixup
35
+ import cv2
36
+ from torchvision.transforms.v2.functional import ten_crop, pad
37
+ import filetype
38
+
39
+ from spai.data import readers
40
+ from spai.data import filestorage
41
+ from spai import data_utils
42
+
43
+
44
class CSVDataset(torch.utils.data.Dataset):
    """Image classification dataset indexed by a CSV file.

    Each CSV row describes one sample through an image-path column, a split
    column ("train"/"val"/"test") and an integer class column; only the rows
    whose split matches `split` are kept. Images are read either from the
    filesystem (paths relative to `csv_root_path`) or, when `lmdb_storage`
    is given, from an LMDB file storage.
    """

    def __init__(
        self,
        csv_path: pathlib.Path,
        csv_root_path: pathlib.Path,
        split: str,
        transform,
        path_column: str = "image",
        split_column: str = "split",
        class_column: str = "class",
        views: int = 1,
        concatenate_views_horizontally: bool = False,
        lmdb_storage: Optional[pathlib.Path] = None,
        views_generator: Optional[Callable[[Image.Image], tuple[Image.Image, ...]]] = None
    ):
        """Initializes the dataset from a CSV index.

        :param csv_path: Absolute path to the CSV index file.
        :param csv_root_path: Root dir the CSV's relative image paths refer to.
        :param split: One of "train", "val", "test".
        :param transform: Callable invoked as ``transform(image=ndarray)`` and
            expected to return a dict containing an "image" tensor.
        :param path_column: CSV column holding the image path.
        :param split_column: CSV column holding the split name.
        :param class_column: CSV column holding the integer class label.
        :param views: Number of augmented views generated per image when no
            `views_generator` is provided.
        :param concatenate_views_horizontally: When True, views are merged
            into one wide image instead of stacked into a new dimension.
        :param lmdb_storage: Optional path to an LMDB file storage to read
            images from instead of the filesystem.
        :param views_generator: Optional callable producing multiple views of
            a PIL image; overrides the `views`-based augmentation.
        """
        super().__init__()
        self.csv_path: pathlib.Path = csv_path
        self.csv_root_path: pathlib.Path = csv_root_path
        self.split: str = split
        self.path_column: str = path_column
        self.split_column: str = split_column
        self.class_column: str = class_column
        self.transform = transform
        self.views: int = views
        self.views_generator: Optional[
            Callable[[Image.Image], tuple[Image.Image, ...]]] = views_generator
        self.concatenate_views_horizontally: bool = concatenate_views_horizontally
        self.lmdb_storage: Optional[pathlib.Path] = lmdb_storage

        # Reader to be used for data loading. Its creation is deferred until
        # the first read so that every DataLoader worker process builds its
        # own reader object (see _create_data_reader()).
        self.data_reader: Optional[readers.DataReader] = None

        if split not in ["train", "val", "test"]:
            raise RuntimeError(f"Unsupported split: {split}")

        # Path of the CSV file is expected to be absolute.
        reader = readers.FileSystemReader(pathlib.Path("/"))
        self.entries: list[dict[str, Any]] = reader.read_csv_file(str(self.csv_path))
        self.entries = [e for e in self.entries if e[self.split_column] == self.split]

        # Number of distinct values found in the class column of this split.
        self.num_classes: int = len(
            collections.Counter([e[self.class_column] for e in self.entries]).keys()
        )

    def __len__(self):
        """Returns the number of samples in the selected split."""
        return len(self.entries)

    def __getitem__(self, idx: int) -> tuple[torch.Tensor, np.ndarray, int]:
        """Returns the requested image sample from the dataset.

        :returns: A tuple containing the image tensor, the labels numpy array and the
            index in dataset.
            Image tensor: (V x 3 x H x W) where V is the number of augmented views.
            Label array: (1, )
            Index
        """
        # Defer the creation of the data reader until the first read operation in order to
        # properly handle the spawning of multiple processes by DataLoader, where each one
        # should contain a separate reader object.
        if self.data_reader is None:
            self._create_data_reader()

        # Load sample.
        img_obj: Image.Image = self.data_reader.load_image(
            self.entries[idx][self.path_column], channels=3
        )
        label: int = int(self.entries[idx][self.class_column])

        # Generate multiple views of an image either through a provided views generation
        # function or through multiple augmentations of the image.
        if self.views_generator is not None:
            augmented_views: tuple[Image.Image, ...] = self.views_generator(img_obj)
            augmented_views: list[np.ndarray] = [np.array(v) for v in augmented_views]
            augmented_views: list[torch.Tensor] = [
                self.transform(image=v)["image"] for v in augmented_views
            ]
        else:
            img: np.ndarray = np.array(img_obj)
            augmented_views: list[torch.Tensor] = []
            for _ in range(self.views):
                augmented_views.append(self.transform(image=img)["image"])

        # Either concatenate the views in a single big image, or provide them stacked
        # into a new tensor dimension.
        if self.concatenate_views_horizontally:
            augmented_img: torch.Tensor = torch.cat(augmented_views, dim=-1)
            augmented_img = augmented_img.unsqueeze(dim=0)
        else:
            augmented_img: torch.Tensor = torch.stack(augmented_views, dim=0)

        # Cleanup resources.
        img_obj.close()

        return augmented_img, np.array(label, dtype=float), idx

    def get_classes_num(self) -> int:
        """Returns the number of distinct classes in the selected split."""
        return self.num_classes

    def get_dataset_root_path(self) -> pathlib.Path:
        """Returns where images are actually loaded from: the LMDB storage
        when configured, the filesystem root otherwise."""
        if self.lmdb_storage is not None:
            return self.lmdb_storage
        else:
            return self.csv_root_path

    def update_dataset_csv(
        self,
        column_name: str,
        values: dict[int, Any],
        export_dir: Optional[pathlib.Path] = None
    ) -> None:
        """Sets `column_name` to `values[idx]` for each entry index in `values`.

        Entries not covered by `values` get an empty string so the column is
        defined for every row. When `export_dir` is given, the updated table
        is written to ``export_dir / <csv file name>``.
        """
        for idx, v in values.items():
            self.entries[idx][column_name] = v

        # Make sure that a valid value for the updated column exists for all entries.
        for e in self.entries:
            if column_name not in e:
                e[column_name] = ""

        if export_dir:
            export_path: pathlib.Path = export_dir / self.csv_path.name
            data_utils.write_csv_file(self.entries, export_path, delimiter=",")

    def _create_data_reader(self) -> None:
        """Creates the per-process reader (filesystem or LMDB-backed)."""
        # Limit the number of OpenCV threads to utilize multiple processes. Otherwise,
        # each process spawns a number of threads equal to the number of logical cores and
        # the overall performance gets worse due to threads congestion.
        cv2.setNumThreads(1)

        if self.lmdb_storage is None:
            self.data_reader: readers.FileSystemReader = readers.FileSystemReader(
                pathlib.Path(self.csv_root_path)
            )
        else:
            self.data_reader: readers.LMDBFileStorageReader = readers.LMDBFileStorageReader(
                filestorage.LMDBFileStorage(self.lmdb_storage, read_only=True)
            )
180
+
181
+
182
class CSVDatasetTriplet(torch.utils.data.Dataset):
    """Triplet-sampling variant of `CSVDataset` for triplet-style losses.

    For every CSV entry (the anchor) one positive sample of the same class and
    one negative sample of a different class are pre-selected by
    `generate_triplets()`; `__getitem__` then returns the three transformed
    images.
    """

    def __init__(
        self,
        csv_path: pathlib.Path,
        csv_root_path: pathlib.Path,
        split: str,
        transform,
        path_column: str = "image",
        split_column: str = "split",
        class_column: str = "class",
        lmdb_storage: Optional[pathlib.Path] = None
    ):
        """Initializes the dataset and pre-computes the triplets.

        :param csv_path: Absolute path to the CSV index file.
        :param csv_root_path: Root dir the CSV's relative image paths refer to.
        :param split: One of "train", "val", "test".
        :param transform: Callable invoked as ``transform(image=ndarray)`` and
            expected to return a dict containing an "image" tensor.
        :param path_column: CSV column holding the image path.
        :param split_column: CSV column holding the split name.
        :param class_column: CSV column holding the integer class label.
        :param lmdb_storage: Optional path to an LMDB file storage to read
            images from instead of the filesystem.
        """
        super().__init__()
        self.csv_path: pathlib.Path = csv_path
        self.csv_root_path: pathlib.Path = csv_root_path
        self.split: str = split
        self.path_column: str = path_column
        self.split_column: str = split_column
        self.class_column: str = class_column
        self.transform = transform
        self.lmdb_storage: Optional[pathlib.Path] = lmdb_storage

        # Reader to be used for data loading. Its creation is deferred until
        # the first read so every DataLoader worker builds its own reader.
        self.data_reader: Optional[readers.DataReader] = None

        if split not in ["train", "val", "test"]:
            raise RuntimeError(f"Unsupported split: {split}")

        # Path of the CSV file is expected to be absolute.
        reader = readers.FileSystemReader(pathlib.Path("/"))
        self.entries: list[dict[str, Any]] = reader.read_csv_file(str(self.csv_path))
        self.entries = [e for e in self.entries if e[self.split_column] == self.split]

        self.num_classes: int = len(
            collections.Counter([e[self.class_column] for e in self.entries]).keys()
        )

        # Save paths that will be accessed by different dataloaders as numpy arrays in
        # order to avoid copy-on-read of python objects, and thus child processes to
        # take huge amounts of memory.
        self.anchor_v: Optional[np.ndarray] = None
        self.anchor_o: Optional[np.ndarray] = None
        self.positive_v: Optional[np.ndarray] = None
        self.positive_o: Optional[np.ndarray] = None
        self.negative_v: Optional[np.ndarray] = None
        self.negative_o: Optional[np.ndarray] = None
        self.triplets_num: Optional[int] = None
        self.generate_triplets()

    def __len__(self) -> int:
        """Returns the number of pre-computed triplets."""
        return self.triplets_num

    def __getitem__(self, idx) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
        """Returns the triplet with the specified index.

        :returns: A tuple in the form of (anchor_img, positive_img, negative_img)
        """
        # Defer the creation of the data reader until the first read operation in order to
        # properly handle the spawning of multiple processes by DataLoader, where each one
        # should contain a separate reader object.
        if self.data_reader is None:
            self._create_data_reader()

        # Decode the packed (values, offsets) numpy representation back into
        # path strings.
        anchor_path: str = sequence_to_string(unpack_sequence(self.anchor_v, self.anchor_o, idx))
        positive_path: str = sequence_to_string(
            unpack_sequence(self.positive_v, self.positive_o, idx))
        negative_path: str = sequence_to_string(
            unpack_sequence(self.negative_v, self.negative_o, idx))

        anchor_img_obj: Image.Image = self.data_reader.load_image(anchor_path, channels=3)
        positive_img_obj: Image.Image = self.data_reader.load_image(positive_path, channels=3)
        negative_img_obj: Image.Image = self.data_reader.load_image(negative_path, channels=3)

        anchor_img: np.ndarray = np.array(anchor_img_obj)
        positive_img: np.ndarray = np.array(positive_img_obj)
        negative_img: np.ndarray = np.array(negative_img_obj)

        # Release the PIL images once copied into numpy arrays.
        anchor_img_obj.close()
        positive_img_obj.close()
        negative_img_obj.close()

        return (self.transform(image=anchor_img)["image"],
                self.transform(image=positive_img)["image"],
                self.transform(image=negative_img)["image"])

    def get_classes_num(self) -> int:
        """Returns the number of distinct classes in the selected split."""
        return self.num_classes

    def get_dataset_root_path(self) -> pathlib.Path:
        """Returns where images are actually loaded from: the LMDB storage
        when configured, the filesystem root otherwise."""
        if self.lmdb_storage is not None:
            return self.lmdb_storage
        else:
            return self.csv_root_path

    def generate_triplets(self) -> None:
        """Builds one (anchor, positive, negative) triplet per dataset entry.

        The positive is a random different entry of the anchor's class; the
        negative is a random entry of a random other class. The resulting
        paths are stored packed as numpy (values, offsets) arrays to avoid
        copy-on-read memory growth in DataLoader worker processes.
        """
        # Separate the entries into groups of each class.
        entries_per_class: dict[int, list[dict[str, Any]]] = {
            i: [] for i in range(self.num_classes)
        }
        for e in self.entries:
            entries_per_class[int(e[self.class_column])].append(e)

        triplets: list[tuple[dict[str,Any], dict[str, Any], dict[str, Any]]] = []
        for class_id, class_group in entries_per_class.items():
            class_group: list[dict[str, Any]] = list(class_group)
            rest_groups: list[list[dict[str, Any]]] = list(entries_per_class.values())
            del rest_groups[class_id]

            for i, e in enumerate(class_group):
                negative_sample: dict[str, Any] = random.choice(random.choice(rest_groups))

                # Re-draw until the positive differs from the anchor itself.
                positive_sample: dict[str, Any] = random.choice(class_group)
                while e == positive_sample:
                    positive_sample: dict[str, Any] = random.choice(class_group)

                triplets.append((e, positive_sample, negative_sample))

        self.anchor_v, self.anchor_o = pack_sequences(
            [string_to_sequence(t[0][self.path_column]) for t in triplets]
        )
        self.positive_v, self.positive_o = pack_sequences(
            [string_to_sequence(t[1][self.path_column]) for t in triplets]
        )
        self.negative_v, self.negative_o = pack_sequences(
            [string_to_sequence(t[2][self.path_column]) for t in triplets]
        )
        self.anchor_labels: np.ndarray = np.array([int(t[0][self.class_column]) for t in triplets])
        self.triplets_num = len(triplets)

    def _create_data_reader(self) -> None:
        """Creates the per-process reader (filesystem or LMDB-backed)."""
        # Limit the number of OpenCV threads to utilize multiple processes. Otherwise,
        # each process spawns a number of threads equal to the number of logical cores and
        # the overall performance gets worse due to threads congestion.
        cv2.setNumThreads(1)

        if self.lmdb_storage is None:
            self.data_reader: readers.FileSystemReader = readers.FileSystemReader(
                pathlib.Path(self.csv_root_path)
            )
        else:
            self.data_reader: readers.LMDBFileStorageReader = readers.LMDBFileStorageReader(
                filestorage.LMDBFileStorage(self.lmdb_storage, read_only=True)
            )
325
+
326
+
327
def build_loader_finetune(config, logger):
    """Builds the train/val datasets, data loaders and optional mixup function.

    Side effect: ``config.MODEL.NUM_CLASSES`` is updated from the training
    set (the config is temporarily defrosted for the assignment).

    :returns: A tuple of (dataset_train, dataset_val, data_loader_train,
        data_loader_val, mixup_fn), where `mixup_fn` is None when
        mixup/cutmix is disabled.
    """
    # NUM_CLASSES is derived from the training CSV, so the (frozen) config
    # must be unlocked around the assignment and re-frozen afterwards.
    config.defrost()
    dataset_train, config.MODEL.NUM_CLASSES = build_dataset(
        config.DATA.DATA_PATH,
        config.DATA.CSV_ROOT,
        config=config,
        split_name="train",
        logger=logger
    )
    config.freeze()
    dataset_val, _ = build_dataset(
        config.DATA.DATA_PATH,
        config.DATA.CSV_ROOT,
        config=config,
        split_name="val",
        logger=logger
    )
    logger.info(f"Train images: {len(dataset_train)} | Validation images: {len(dataset_val)}")
    logger.info(f"Train Images Source: {dataset_train.get_dataset_root_path()}")
    logger.info(f"Validation Images Source: {dataset_val.get_dataset_root_path()}")

    data_loader_train = DataLoader(
        dataset_train,
        batch_size=config.DATA.BATCH_SIZE,
        num_workers=config.DATA.NUM_WORKERS,
        pin_memory=config.DATA.PIN_MEMORY,
        drop_last=True,
        shuffle=True,
        prefetch_factor=config.DATA.PREFETCH_FACTOR
    )
    # Validation falls back to the training batch size / prefetch factor when
    # no dedicated values are configured. In "arbitrary" resolution mode a
    # list-building collate function is used instead of the default one
    # (presumably because variable-sized images cannot be stacked — see
    # image_enlisting_collate_fn).
    data_loader_val = DataLoader(
        dataset_val,
        batch_size=config.DATA.VAL_BATCH_SIZE or config.DATA.BATCH_SIZE,
        num_workers=config.DATA.NUM_WORKERS,
        pin_memory=config.DATA.PIN_MEMORY,
        drop_last=False,
        prefetch_factor=config.DATA.VAL_PREFETCH_FACTOR or config.DATA.PREFETCH_FACTOR,
        collate_fn=(torch.utils.data.default_collate
                    if not config.MODEL.RESOLUTION_MODE == "arbitrary"
                    else image_enlisting_collate_fn)
    )

    # Setup mixup / cutmix. Active when any of the mixup/cutmix knobs is set.
    mixup_fn = None
    mixup_active: bool = (config.AUG.MIXUP > 0
                          or config.AUG.CUTMIX > 0.
                          or config.AUG.CUTMIX_MINMAX is not None)
    if mixup_active:
        mixup_fn = Mixup(
            mixup_alpha=config.AUG.MIXUP,
            cutmix_alpha=config.AUG.CUTMIX,
            cutmix_minmax=config.AUG.CUTMIX_MINMAX,
            prob=config.AUG.MIXUP_PROB,
            switch_prob=config.AUG.MIXUP_SWITCH_PROB,
            mode=config.AUG.MIXUP_MODE,
            label_smoothing=config.MODEL.LABEL_SMOOTHING,
            num_classes=config.MODEL.NUM_CLASSES
        )

    return dataset_train, dataset_val, data_loader_train, data_loader_val, mixup_fn
387
+
388
+
389
def build_loader_test(
    config,
    logger,
    split: str = "test",
    dummy_csv_dir: Optional[pathlib.Path] = None,
) -> tuple[list[str], list[torch.utils.data.Dataset], list[torch.utils.data.DataLoader]]:
    """Builds a dataset and a data loader for every configured test input.

    Each entry of ``config.DATA.TEST_DATA_PATH`` may be either a CSV index
    file or a directory of images; for directories a dummy CSV (with class
    "1") is generated under `dummy_csv_dir` (default: ./outputs).

    :param config: Experiment config providing the DATA.* options.
    :param logger: Logger for progress messages.
    :param split: Split name used for the generated CSVs and for filtering.
    :param dummy_csv_dir: Where to write generated CSVs for directory inputs.
    :returns: A tuple of (dataset names, datasets, data loaders), one item
        per test input.
    :raises RuntimeError: If the test sets disagree on the number of classes.
    """
    # Obtain the root directory for each test input (either a CSV file or a directory).
    # One root per input, a single shared root, or each input's parent dir.
    input_root_paths: list[pathlib.Path]
    if len(config.DATA.TEST_DATA_CSV_ROOT) > 1:
        input_root_paths = [pathlib.Path(p) for p in config.DATA.TEST_DATA_CSV_ROOT]
    elif len(config.DATA.TEST_DATA_CSV_ROOT) == 1:
        input_root_paths = [pathlib.Path(config.DATA.TEST_DATA_CSV_ROOT[0])
                            for _ in config.DATA.TEST_DATA_PATH]
    else:
        input_root_paths = [pathlib.Path(input_path).parent
                            for input_path in config.DATA.TEST_DATA_PATH]

    # If some input is a directory, create a dummy csv file for it.
    csv_paths: list[pathlib.Path] = []
    csv_root_paths: list[pathlib.Path] = []
    for input_path, input_root_path in zip(config.DATA.TEST_DATA_PATH, input_root_paths):
        input_path: pathlib.Path = pathlib.Path(input_path)
        if input_path.is_dir():
            # Create a dummy csv and point directories
            if dummy_csv_dir is None:
                dummy_csv_dir = pathlib.Path("./outputs")
            entries: list[dict[str, str]] = [
                {
                    "image": str(file_path.name),
                    "split": split,
                    "class": "1"  # TODO: Remove csv requirement for dummy ground-truth.
                }
                for file_path in input_path.iterdir() if filetype.is_image(file_path)
            ]
            dummy_csv_path: pathlib.Path = dummy_csv_dir / f"{input_path.stem}.csv"
            data_utils.write_csv_file(entries, dummy_csv_path, delimiter=",")
            csv_paths.append(dummy_csv_path.absolute())
            csv_root_paths.append(input_path.absolute())  # Paths in CSV are relative to input dir.
        else:
            csv_paths.append(input_path.absolute())
            csv_root_paths.append(input_root_path.absolute())

    # Obtain the separate testing sets and their names.
    test_datasets: list[CSVDataset] = []
    test_datasets_names: list[str] = []
    num_classes_per_dataset: list[int] = []
    for csv_path, csv_root_path in zip(csv_paths, csv_root_paths):
        csv_path: pathlib.Path = pathlib.Path(csv_path)
        dataset: CSVDataset
        dataset, num_classes = build_dataset(csv_path, csv_root_path, config, split, logger)
        test_datasets.append(dataset)
        test_datasets_names.append(csv_path.stem)
        num_classes_per_dataset.append(num_classes)
    # Check that the number of classes match among all test sets.
    unique_number_of_classes: list[int] = list(collections.Counter(num_classes_per_dataset).keys())
    if len(unique_number_of_classes) > 1:
        raise RuntimeError(
            f"Encountered different number of classes among test sets: {unique_number_of_classes}"
        )

    for dataset, dataset_name in zip(test_datasets, test_datasets_names):
        logger.info(f"Dataset \'{dataset_name}\' | Split: {split} | Total images: {len(dataset)} | "
                    f"Source: {dataset.get_dataset_root_path()}")

    # Create the corresponding data loaders. On CPU-only runs (forced via
    # SPAI_FORCE_CPU or no visible CUDA devices) workers are capped and
    # pinned memory is disabled.
    force_cpu: bool = os.environ.get("SPAI_FORCE_CPU", "0") == "1"
    cuda_visible: str = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    use_cuda: bool = (not force_cpu) and (cuda_visible != "")
    test_num_workers: int = config.DATA.NUM_WORKERS if use_cuda else min(config.DATA.NUM_WORKERS, 2)
    test_pin_memory: bool = config.DATA.PIN_MEMORY if use_cuda else False

    test_data_loaders: list[torch.utils.data.DataLoader] = [
        DataLoader(
            dataset,
            batch_size=config.DATA.TEST_BATCH_SIZE or config.DATA.BATCH_SIZE,
            num_workers=test_num_workers,
            pin_memory=test_pin_memory,
            drop_last=False,
            prefetch_factor=config.DATA.TEST_PREFETCH_FACTOR or config.DATA.PREFETCH_FACTOR,
            collate_fn=(torch.utils.data.default_collate
                        if not config.MODEL.RESOLUTION_MODE == "arbitrary"
                        else image_enlisting_collate_fn)
        )
        for dataset in test_datasets
    ]

    return test_datasets_names, test_datasets, test_data_loaders
476
+
477
+
478
+ def build_dataset(
479
+ csv_path: pathlib.Path,
480
+ csv_root_dir: pathlib.Path,
481
+ config,
482
+ split_name: str,
483
+ logger,
484
+ ) -> tuple[Union[CSVDataset, CSVDatasetTriplet], int]:
485
+ if split_name not in ["train", "val", "test"]:
486
+ raise RuntimeError(f"Unsupported split: {split_name}")
487
+
488
+ transform = build_transform(split_name == "train", config)
489
+ logger.info(f"Data transform | mode: {config.TRAIN.MODE} | split: {split_name}:\n{transform}")
490
+
491
+ if split_name == "train" and config.TRAIN.LOSS == "triplet":
492
+ dataset = CSVDatasetTriplet(
493
+ csv_path,
494
+ csv_root_dir,
495
+ split=split_name,
496
+ transform=transform,
497
+ lmdb_storage=pathlib.Path(config.DATA.LMDB_PATH) if config.DATA.LMDB_PATH else None
498
+ )
499
+ elif split_name == "train" and config.TRAIN.LOSS == "supcont":
500
+ assert config.DATA.AUGMENTED_VIEWS > 1, "SupCon loss requires at least 2 views."
501
+ dataset = CSVDataset(
502
+ csv_path,
503
+ csv_root_dir,
504
+ split=split_name,
505
+ transform=transform,
506
+ views=config.DATA.AUGMENTED_VIEWS,
507
+ lmdb_storage=pathlib.Path(config.DATA.LMDB_PATH) if config.DATA.LMDB_PATH else None
508
+ )
509
+ elif split_name == "train" and config.MODEL.RESOLUTION_MODE == "arbitrary":
510
+ dataset = CSVDataset(
511
+ csv_path,
512
+ csv_root_dir,
513
+ split=split_name,
514
+ transform=transform,
515
+ views=config.DATA.AUGMENTED_VIEWS,
516
+ concatenate_views_horizontally=True,
517
+ lmdb_storage=pathlib.Path(config.DATA.LMDB_PATH) if config.DATA.LMDB_PATH else None
518
+ )
519
+ else:
520
+ views_generator: Optional[Callable[[Image.Image], tuple[Image.Image, ...]]]
521
+ if config.TEST.VIEWS_GENERATION_APPROACH == "tencrop":
522
+ def safe_ten_crop(img: Image.Image) -> tuple[Image.Image, ...]:
523
+ width = img.width
524
+ height = img.height
525
+ left_padding: int = max((config.DATA.IMG_SIZE - width) // 2, 0)
526
+ right_padding: int = max(
527
+ (config.DATA.IMG_SIZE - width) // 2
528
+ + (((config.DATA.IMG_SIZE - width) % 2) if config.DATA.IMG_SIZE > width else 0),
529
+ 0
530
+ )
531
+ top_padding: int = max((config.DATA.IMG_SIZE - height) // 2, 0)
532
+ bottom_padding: int = max(
533
+ (config.DATA.IMG_SIZE - height) // 2
534
+ + (((config.DATA.IMG_SIZE - height) % 2) if config.DATA.IMG_SIZE > height else 0),
535
+ 0
536
+ )
537
+ img = pad(img, [left_padding, top_padding, right_padding, bottom_padding])
538
+ return ten_crop(img, size=config.DATA.IMG_SIZE)
539
+
540
+ views_generator = safe_ten_crop
541
+ elif config.TEST.VIEWS_GENERATION_APPROACH is None:
542
+ views_generator = None
543
+ else:
544
+ raise TypeError(f"{config.TEST.VIEW_GENERATION_APPROACH} is not a supported "
545
+ f"view generation approach.")
546
+
547
+ dataset = CSVDataset(
548
+ csv_path,
549
+ csv_root_dir,
550
+ split=split_name,
551
+ transform=transform,
552
+ lmdb_storage=pathlib.Path(config.DATA.LMDB_PATH) if config.DATA.LMDB_PATH else None,
553
+ views_generator=views_generator
554
+ )
555
+ num_classes: int = dataset.get_classes_num()
556
+
557
+ return dataset, num_classes
558
+
559
+
560
+ def build_transform(is_train, config) -> Callable[[np.ndarray], np.ndarray]:
561
+ # resize_im: bool = config.DATA.IMG_SIZE > 32
562
+ # # this should always dispatch to transforms_imagenet_train
563
+ # transform = create_transform(
564
+ # input_size=config.DATA.IMG_SIZE,
565
+ # is_training=True,
566
+ # color_jitter=config.AUG.COLOR_JITTER if config.AUG.COLOR_JITTER > 0 else None,
567
+ # auto_augment=config.AUG.AUTO_AUGMENT if config.AUG.AUTO_AUGMENT != 'none' else None,
568
+ # re_prob=config.AUG.REPROB,
569
+ # re_mode=config.AUG.REMODE,
570
+ # re_count=config.AUG.RECOUNT,
571
+ # interpolation=config.DATA.INTERPOLATION,
572
+ # )
573
+ # if not resize_im:
574
+ # # replace RandomResizedCropAndInterpolation with
575
+ # # RandomCrop
576
+ # transform.transforms[0] = transforms.RandomCrop(config.DATA.IMG_SIZE, padding=4)
577
+ # transform.transforms.insert(0, torchvision.transforms.v2.JPEG((50, 100)))
578
+ # transform.transforms.insert(4, torchvision.transforms.GaussianBlur(kernel_size=(3, 9), sigma=(0.01, 0.5)))
579
+
580
+ if is_train: # Training augmentations
581
+ transforms_list = []
582
+
583
+ if config.AUG.MIN_CROP_AREA == config.AUG.MAX_CROP_AREA:
584
+ transforms_list.append(
585
+ A.PadIfNeeded(min_height=config.DATA.IMG_SIZE, min_width=config.DATA.IMG_SIZE)
586
+ )
587
+ transforms_list.append(
588
+ A.RandomCrop(height=config.DATA.IMG_SIZE, width=config.DATA.IMG_SIZE)
589
+ )
590
+ else:
591
+ transforms_list.append(
592
+ A.RandomResizedCrop(size=(config.DATA.IMG_SIZE, config.DATA.IMG_SIZE),
593
+ scale=(config.AUG.MIN_CROP_AREA, config.AUG.MAX_CROP_AREA))
594
+ )
595
+ transforms_list.extend([
596
+ A.HorizontalFlip(p=config.AUG.HORIZONTAL_FLIP_PROB),
597
+ A.VerticalFlip(p=config.AUG.VERTICAL_FLIP_PROB),
598
+ A.Rotate(limit=config.AUG.ROTATION_DEGREES,
599
+ crop_border=True,
600
+ p=config.AUG.ROTATION_PROB)
601
+ ])
602
+ if config.AUG.ROTATION_PROB > .0:
603
+ # Rotation with crop_border set to True leads to images smaller than the target
604
+ # size. So, restore the target size.
605
+ transforms_list.append(
606
+ A.Resize(height=config.DATA.IMG_SIZE, width=config.DATA.IMG_SIZE)
607
+ )
608
+ transforms_list.extend([
609
+ A.GaussianBlur(blur_limit=(3, 9),
610
+ sigma_limit=(0.01, 0.5),
611
+ p=config.AUG.GAUSSIAN_BLUR_PROB),
612
+ A.GaussNoise(p=config.AUG.GAUSSIAN_NOISE_PROB),
613
+ A.ColorJitter(
614
+ p=config.AUG.COLOR_JITTER,
615
+ brightness=config.AUG.COLOR_JITTER_BRIGHTNESS_RANGE,
616
+ contrast=config.AUG.COLOR_JITTER_CONTRAST_RANGE,
617
+ saturation=config.AUG.COLOR_JITTER_SATURATION_RANGE,
618
+ hue=config.AUG.COLOR_JITTER_HUE_RANGE,
619
+ ),
620
+ A.Sharpen(p=config.AUG.SHARPEN_PROB,
621
+ alpha=config.AUG.SHARPEN_ALPHA_RANGE,
622
+ lightness=config.AUG.SHARPEN_LIGHTNESS_RANGE),
623
+ A.ImageCompression(quality_lower=config.AUG.JPEG_MIN_QUALITY,
624
+ quality_upper=config.AUG.JPEG_MAX_QUALITY,
625
+ compression_type=ImageCompressionType.JPEG,
626
+ p=config.AUG.JPEG_COMPRESSION_PROB),
627
+ A.ImageCompression(quality_lower=config.AUG.WEBP_MIN_QUALITY,
628
+ quality_upper=config.AUG.WEBP_MAX_QUALITY,
629
+ compression_type=ImageCompressionType.WEBP,
630
+ p=config.AUG.WEBP_COMPRESSION_PROB),
631
+ ])
632
+ if config.MODEL.REQUIRED_NORMALIZATION == "imagenet":
633
+ transforms_list.append(
634
+ A.Normalize(mean=IMAGENET_DEFAULT_MEAN, std=IMAGENET_DEFAULT_STD)
635
+ )
636
+ elif config.MODEL.REQUIRED_NORMALIZATION == "positive_0_1":
637
+ transforms_list.append(
638
+ A.Normalize(mean=0., std=1.)
639
+ )
640
+ else:
641
+ raise RuntimeError(f"Unsupported Normalization: {config.MODEL.REQUIRED_NORMALIZATION}")
642
+ transforms_list.append(ToTensorV2())
643
+ transform = A.Compose(transforms_list)
644
+
645
+ else: # Inference augmentations
646
+ transforms_list = [
647
+ A.ImageCompression(quality_lower=config.TEST.JPEG_QUALITY,
648
+ quality_upper=config.TEST.JPEG_QUALITY,
649
+ compression_type=ImageCompressionType.JPEG,
650
+ p=1.0 if config.TEST.JPEG_COMPRESSION else .0),
651
+ A.ImageCompression(quality_lower=config.TEST.WEBP_QUALITY,
652
+ quality_upper=config.TEST.WEBP_QUALITY,
653
+ compression_type=ImageCompressionType.WEBP,
654
+ p=1.0 if config.TEST.WEBP_COMPRESSION else .0),
655
+ A.GaussianBlur(blur_limit=(config.TEST.GAUSSIAN_BLUR_KERNEL_SIZE,
656
+ config.TEST.GAUSSIAN_BLUR_KERNEL_SIZE),
657
+ sigma_limit=0,
658
+ p=1.0 if config.TEST.GAUSSIAN_BLUR else .0),
659
+ A.GaussNoise(var_limit=(config.TEST.GAUSSIAN_NOISE_SIGMA**2,
660
+ config.TEST.GAUSSIAN_NOISE_SIGMA**2),
661
+ p=1.0 if config.TEST.GAUSSIAN_NOISE else .0),
662
+ A.RandomScale(scale_limit=(config.TEST.SCALE_FACTOR-1, config.TEST.SCALE_FACTOR-1),
663
+ p=1.0 if config.TEST.SCALE else .0)
664
+ ]
665
+ if config.TEST.MAX_SIZE is not None:
666
+ transforms_list.append(A.SmallestMaxSize(max_size=config.TEST.MAX_SIZE))
667
+
668
+ if config.TEST.ORIGINAL_RESOLUTION:
669
+ transforms_list.append(A.PadIfNeeded(min_height=config.DATA.IMG_SIZE,
670
+ min_width=config.DATA.IMG_SIZE))
671
+ elif config.TEST.CROP:
672
+ transforms_list.append(A.PadIfNeeded(min_height=config.DATA.IMG_SIZE,
673
+ min_width=config.DATA.IMG_SIZE))
674
+ transforms_list.append(A.CenterCrop(height=config.DATA.IMG_SIZE,
675
+ width=config.DATA.IMG_SIZE))
676
+ else:
677
+ transforms_list.append(A.Resize(config.DATA.IMG_SIZE, config.DATA.IMG_SIZE))
678
+ if config.MODEL.REQUIRED_NORMALIZATION == "imagenet":
679
+ transforms_list.append(A.Normalize(mean=IMAGENET_DEFAULT_MEAN, std=IMAGENET_DEFAULT_STD))
680
+ elif config.MODEL.REQUIRED_NORMALIZATION == "positive_0_1":
681
+ transforms_list.append(A.Normalize(mean=0., std=1.))
682
+ else:
683
+ raise RuntimeError(f"Unsupported Normalization: {config.MODEL.REQUIRED_NORMALIZATION}")
684
+ transforms_list.append(ToTensorV2())
685
+ transform = A.Compose(transforms_list)
686
+
687
+ return transform
688
+
689
+
690
+ def string_to_sequence(s: str, dtype=np.int32) -> np.ndarray:
691
+ return np.array([ord(c) for c in s], dtype=dtype)
692
+
693
+
694
+ def sequence_to_string(seq: np.ndarray) -> str:
695
+ return ''.join([chr(c) for c in seq])
696
+
697
+
698
+ def pack_sequences(seqs: Union[np.ndarray, list]) -> (np.ndarray, np.ndarray):
699
+ values = np.concatenate(seqs, axis=0)
700
+ offsets = np.cumsum([len(s) for s in seqs])
701
+ return values, offsets
702
+
703
+
704
+ def unpack_sequence(values: np.ndarray, offsets: np.ndarray, index: int) -> np.ndarray:
705
+ off1 = offsets[index]
706
+ if index > 0:
707
+ off0 = offsets[index - 1]
708
+ elif index == 0:
709
+ off0 = 0
710
+ else:
711
+ raise ValueError(index)
712
+ return values[off0:off1]
713
+
714
+
715
+ def image_enlisting_collate_fn(
716
+ batch: Iterable[tuple[torch.Tensor, np.ndarray, int]]
717
+ ) -> tuple[list[torch.Tensor], torch.Tensor, torch.Tensor]:
718
+ """Collate function that enlists its entries."""
719
+ return (
720
+ [torch.utils.data.default_collate([s[0]]) for s in batch],
721
+ torch.utils.data.default_collate([s[1] for s in batch]),
722
+ torch.utils.data.default_collate([s[2] for s in batch]),
723
+ )
spai/data/data_mfm.py ADDED
@@ -0,0 +1,131 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import numpy as np
2
+ import torch
3
+ import torch.distributed as dist
4
+ import torchvision.transforms as T
5
+ from torch.utils.data import DataLoader, DistributedSampler
6
+ from torch.utils.data._utils.collate import default_collate
7
+ from torchvision.datasets import ImageFolder
8
+ from timm.data.transforms import _pil_interp
9
+
10
+ from .random_degradations import RandomBlur, RandomNoise
11
+
12
+
13
+ class FreqMaskGenerator:
14
+ def __init__(self,
15
+ input_size=224,
16
+ mask_radius1=16,
17
+ mask_radius2=999,
18
+ sample_ratio=0.5):
19
+ self.input_size = input_size
20
+ self.mask_radius1 = mask_radius1
21
+ self.mask_radius2 = mask_radius2
22
+ self.sample_ratio = sample_ratio
23
+ self.mask = np.ones((self.input_size, self.input_size), dtype=int)
24
+ for y in range(self.input_size):
25
+ for x in range(self.input_size):
26
+ if ((x - self.input_size // 2) ** 2 + (y - self.input_size // 2) ** 2) >= self.mask_radius1 ** 2 \
27
+ and ((x - self.input_size // 2) ** 2 + (y - self.input_size // 2) ** 2) < self.mask_radius2 ** 2:
28
+ self.mask[y, x] = 0
29
+
30
+ def __call__(self):
31
+ rnd = torch.bernoulli(torch.tensor(self.sample_ratio, dtype=torch.float)).item()
32
+ if rnd == 0: # high-pass
33
+ return 1 - self.mask
34
+ elif rnd == 1: # low-pass
35
+ return self.mask
36
+ else:
37
+ raise ValueError
38
+
39
+
40
+ class MFMTransform:
41
+ def __init__(self, config):
42
+ self.transform_img = T.Compose([
43
+ T.Lambda(lambda img: img.convert('RGB') if img.mode != 'RGB' else img),
44
+ T.RandomResizedCrop(config.DATA.IMG_SIZE, scale=(config.DATA.MIN_CROP_SCALE, 1.), interpolation=_pil_interp(config.DATA.INTERPOLATION)),
45
+ T.RandomHorizontalFlip(),
46
+ ])
47
+
48
+ self.filter_type = config.DATA.FILTER_TYPE
49
+
50
+ if config.MODEL.TYPE == 'swin':
51
+ model_patch_size = config.MODEL.SWIN.PATCH_SIZE
52
+ elif config.MODEL.TYPE == 'vit':
53
+ model_patch_size = config.MODEL.VIT.PATCH_SIZE
54
+ elif config.MODEL.TYPE == 'resnet':
55
+ model_patch_size = 1
56
+ else:
57
+ raise NotImplementedError
58
+
59
+ if config.DATA.FILTER_TYPE == 'deblur':
60
+ self.degrade_transform = RandomBlur(
61
+ params=dict(
62
+ kernel_size=config.DATA.BLUR.KERNEL_SIZE,
63
+ kernel_list=config.DATA.BLUR.KERNEL_LIST,
64
+ kernel_prob=config.DATA.BLUR.KERNEL_PROB,
65
+ sigma_x=config.DATA.BLUR.SIGMA_X,
66
+ sigma_y=config.DATA.BLUR.SIGMA_Y,
67
+ rotate_angle=config.DATA.BLUR.ROTATE_ANGLE,
68
+ beta_gaussian=config.DATA.BLUR.BETA_GAUSSIAN,
69
+ beta_plateau=config.DATA.BLUR.BETA_PLATEAU),
70
+ )
71
+ elif config.DATA.FILTER_TYPE == 'denoise':
72
+ self.degrade_transform = RandomNoise(
73
+ params=dict(
74
+ noise_type=config.DATA.NOISE.TYPE,
75
+ noise_prob=config.DATA.NOISE.PROB,
76
+ gaussian_sigma=config.DATA.NOISE.GAUSSIAN_SIGMA,
77
+ gaussian_gray_noise_prob=config.DATA.NOISE.GAUSSIAN_GRAY_NOISE_PROB,
78
+ poisson_scale=config.DATA.NOISE.POISSON_SCALE,
79
+ poisson_gray_noise_prob=config.DATA.NOISE.POISSON_GRAY_NOISE_PROB),
80
+ )
81
+ elif config.DATA.FILTER_TYPE == 'mfm':
82
+ self.freq_mask_generator = FreqMaskGenerator(
83
+ input_size=config.DATA.IMG_SIZE,
84
+ mask_radius1=config.DATA.MASK_RADIUS1,
85
+ mask_radius2=config.DATA.MASK_RADIUS2,
86
+ sample_ratio=config.DATA.SAMPLE_RATIO
87
+ )
88
+
89
+ def __call__(self, img):
90
+ img = self.transform_img(img) # PIL Image (HxWxC, 0-255), no normalization
91
+ if self.filter_type in ['deblur', 'denoise']:
92
+ img_lq = np.array(img).astype(np.float32) / 255.
93
+ img_lq = self.degrade_transform(img_lq)
94
+ img_lq = torch.from_numpy(img_lq.transpose(2, 0, 1))
95
+ else:
96
+ img_lq = None
97
+ img = T.ToTensor()(img) # Tensor (CxHxW, 0-1)
98
+ if self.filter_type == 'mfm':
99
+ mask = self.freq_mask_generator()
100
+ else:
101
+ mask = None
102
+
103
+ return img, img_lq, mask
104
+
105
+
106
+ def collate_fn(batch):
107
+ if not isinstance(batch[0][0], tuple):
108
+ return default_collate(batch)
109
+ else:
110
+ batch_num = len(batch)
111
+ ret = []
112
+ for item_idx in range(len(batch[0][0])):
113
+ if batch[0][0][item_idx] is None:
114
+ ret.append(None)
115
+ else:
116
+ ret.append(default_collate([batch[i][0][item_idx] for i in range(batch_num)]))
117
+ ret.append(default_collate([batch[i][1] for i in range(batch_num)]))
118
+ return ret
119
+
120
+
121
+ def build_loader_mfm(config, logger):
122
+ transform = MFMTransform(config)
123
+ logger.info(f'Pre-train data transform:\n{transform}')
124
+
125
+ dataset = ImageFolder(config.DATA.DATA_PATH, transform)
126
+ logger.info(f'Build dataset: train images = {len(dataset)}')
127
+
128
+ sampler = DistributedSampler(dataset, num_replicas=dist.get_world_size(), rank=dist.get_rank(), shuffle=True)
129
+ dataloader = DataLoader(dataset, config.DATA.BATCH_SIZE, sampler=sampler, num_workers=config.DATA.NUM_WORKERS, pin_memory=True, drop_last=True, collate_fn=collate_fn)
130
+
131
+ return dataloader
spai/data/filestorage.py ADDED
@@ -0,0 +1,387 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # SPDX-FileCopyrightText: Copyright (c) 2025 Centre for Research and Technology Hellas
2
+ # and University of Amsterdam. All rights reserved.
3
+ # SPDX-License-Identifier: Apache-2.0
4
+ #
5
+ # Licensed under the Apache License, Version 2.0 (the "License");
6
+ # you may not use this file except in compliance with the License.
7
+ # You may obtain a copy of the License at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # Unless required by applicable law or agreed to in writing, software
12
+ # distributed under the License is distributed on an "AS IS" BASIS,
13
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14
+ # See the License for the specific language governing permissions and
15
+ # limitations under the License.
16
+
17
+ import csv
18
+ import hashlib
19
+ import io
20
+ import logging
21
+ import pathlib
22
+ from collections import Counter
23
+ from typing import Union, Optional
24
+
25
+ import click
26
+ import lmdb
27
+ import tqdm
28
+ import networkx as nx
29
+
30
+
31
+ __version__: str = "0.1.0-alpha"
32
+ __revision__: int = 2
33
+ __author__: str = "Dimitrios Karageorgiou"
34
+ __email__: str = "dkarageo@iti.gr"
35
+
36
+
37
+ class LMDBFileStorage:
38
+ """A file storage for handling large datasets based on LMDB."""
39
+ def __init__(self,
40
+ db_path: pathlib.Path,
41
+ map_size: int = 1024*1024*1024*1024, # 1TB
42
+ read_only: bool = False,
43
+ max_readers: int = 128):
44
+ self.db: lmdb.Environment = lmdb.open(
45
+ str(db_path),
46
+ map_size=map_size,
47
+ subdir=False,
48
+ readonly=read_only,
49
+ max_readers=max_readers,
50
+ lock=False,
51
+ sync=False
52
+ )
53
+
54
+ def open_file(self, file_id: str, mode: str = "r") -> Union[io.TextIOWrapper, io.BytesIO]:
55
+ """Returns a file-like stream of a file in the database."""
56
+ # with self.db.begin() as trans:
57
+ # data: bytes = trans.get(file_id.encode("ascii"))
58
+ with self.db.begin(buffers=True) as trans:
59
+ data = trans.get(file_id.encode("utf-8"))
60
+ stream: io.BytesIO = io.BytesIO(data)
61
+
62
+ if mode == "r":
63
+ reader: io.TextIOWrapper = io.TextIOWrapper(stream)
64
+ elif mode == "b":
65
+ reader: io.BytesIO = stream
66
+ else:
67
+ raise RuntimeError(f"Unsupported file mode: '{mode}'. Only 'r' and 'b' are supported.")
68
+
69
+ return reader
70
+
71
+ def write_file(self, file_id: str, file_data: bytes) -> None:
72
+ with self.db.begin(write=True) as trans:
73
+ trans.put(file_id.encode("utf-8"), file_data)
74
+
75
+ def get_all_ids(self) -> list[str]:
76
+ with self.db.begin() as trans:
77
+ cursor = trans.cursor()
78
+ ids: list[str] = [k for k, _ in cursor]
79
+ return ids
80
+
81
+ def close(self) -> None:
82
+ self.db.close()
83
+
84
+
85
+ @click.group()
86
+ def cli() -> None:
87
+ pass
88
+
89
+
90
+ @cli.command()
91
+ @click.option("-c", "--csv_file", required=True,
92
+ type=click.Path(dir_okay=False, exists=True, path_type=pathlib.Path),
93
+ help="Path to a CSV file containing relative paths to the dataset files.")
94
+ @click.option("-b", "--base_dir",
95
+ type=click.Path(file_okay=False, exists=True, path_type=pathlib.Path),
96
+ help="Base directory of the dataset. Paths inside the CSV should be relative "
97
+ "to that path. When not provided, the directory of the CSV file is "
98
+ "considered as the base directory.")
99
+ @click.option("-o", "--output_file", required=True,
100
+ type=click.Path(dir_okay=False, path_type=pathlib.Path),
101
+ help="Path to the database. If the file does not "
102
+ "exist, a new database is generated. Otherwise, it should point to a "
103
+ "previous instance of the LMDB, where data will be added.")
104
+ def add_csv(
105
+ csv_file: pathlib.Path,
106
+ base_dir: Optional[pathlib.Path],
107
+ output_file: pathlib.Path
108
+ ) -> None:
109
+ if base_dir is None:
110
+ base_dir = csv_file.parent
111
+ db: LMDBFileStorage = LMDBFileStorage(output_file)
112
+ add_csv_to_db(csv_file, db, base_dir)
113
+ db.close()
114
+
115
+
116
+ @cli.command()
117
+ @click.option("-s", "--src", required=True,
118
+ type=click.Path(dir_okay=False, path_type=pathlib.Path, exists=True),
119
+ help="Database whose files will be added to the destination database.")
120
+ @click.option("-d", "--dest", required=True,
121
+ type=click.Path(dir_okay=False, path_type=pathlib.Path),
122
+ help="Database where file from source database will be added.")
123
+ def add_db(
124
+ src: pathlib.Path,
125
+ dest: pathlib.Path
126
+ ) -> None:
127
+ """Adds all the contents of a database to another."""
128
+ src_db: LMDBFileStorage = LMDBFileStorage(src, read_only=True)
129
+ dest_db: LMDBFileStorage = LMDBFileStorage(dest)
130
+
131
+ for k in tqdm.tqdm(src_db.get_all_ids(), desc="Copying files", unit="file"):
132
+ k = str(k, 'UTF-8')
133
+ src_file: io.BytesIO = src_db.open_file(k, mode="b")
134
+ dest_db.write_file(k, src_file.read())
135
+
136
+ src_db.close()
137
+ dest_db.close()
138
+
139
+
140
+ @cli.command()
141
+ @click.option("-c", "--csv_file", required=True,
142
+ type=click.Path(dir_okay=False, exists=True, path_type=pathlib.Path),
143
+ help="Path to a CSV file containing relative paths to the dataset files.")
144
+ @click.option("-b", "--base_dir",
145
+ type=click.Path(file_okay=False, exists=True, path_type=pathlib.Path),
146
+ help="Base directory of the dataset. Paths inside the CSV should be relative "
147
+ "to that path. When not provided, the directory of the CSV file is "
148
+ "considered as the base directory.")
149
+ @click.option("-o", "--output_file", required=True,
150
+ type=click.Path(dir_okay=False, path_type=pathlib.Path, exists=True),
151
+ help="Path to the database to verify.")
152
+ def verify_csv(
153
+ csv_file: pathlib.Path,
154
+ base_dir: Optional[pathlib.Path],
155
+ output_file: pathlib.Path
156
+ ) -> None:
157
+ if base_dir is None:
158
+ base_dir = csv_file.parent
159
+ db: LMDBFileStorage = LMDBFileStorage(output_file, read_only=True)
160
+ verify_csv_in_db(csv_file, db, base_dir)
161
+ db.close()
162
+
163
+
164
+ @cli.command()
165
+ @click.option("-d", "--database", required=True,
166
+ type=click.Path(dir_okay=False, path_type=pathlib.Path, exists=True),
167
+ help="Database whose keys will be printed.")
168
+ @click.option("-h", "--hierarchical", is_flag=True,
169
+ help="List files in DB according to directories hierarchy.")
170
+ def list_db(
171
+ database: pathlib.Path,
172
+ hierarchical: bool
173
+ ) -> None:
174
+ """Lists the contents of a file storage."""
175
+ db: LMDBFileStorage = LMDBFileStorage(database, read_only=True)
176
+
177
+ if not hierarchical:
178
+ for k in db.get_all_ids():
179
+ print(k)
180
+ else:
181
+ # The db contains filenames as keys, so their parents will always be the dir names.
182
+ ids: list[str] = [str(pathlib.Path(str(k, 'UTF-8')).parent) for k in db.get_all_ids()]
183
+ counts: Counter = Counter(ids)
184
+
185
+ dir_graph: nx.DiGraph = nx.DiGraph()
186
+ for k in counts.keys():
187
+ dir_graph.add_edge(str(pathlib.Path(k).parent), k)
188
+ dir_graph.nodes[k]["items_num"] = counts[k]
189
+
190
+ top_level_nodes: list[str] = [n for n in dir_graph.nodes if dir_graph.in_degree(n) == 0]
191
+ top_level_nodes = sorted(top_level_nodes)
192
+
193
+ for n in top_level_nodes:
194
+ print_dirs_from_graph(dir_graph, n)
195
+
196
+
197
+ def add_csv_to_db(
198
+ csv_file: pathlib.Path,
199
+ db: LMDBFileStorage,
200
+ base_dir: pathlib.Path,
201
+ key_base_dir: Optional[pathlib.Path] = None,
202
+ verbose: bool = True
203
+ ) -> int:
204
+ """Adds the contents of the file paths included in a CSV file into an LMDB File Storage.
205
+
206
+ Paths of the files, relative to the base dir, are utilized as keys into the storage.
207
+ Thus, the maximum allowed path length is 511 bytes.
208
+
209
+ The contents of nested CSV files are recursively added into the LMDB File Storage.
210
+ In that case, keys represent the file structure relative to the base dir.
211
+
212
+ :param csv_file: Path to a CSV file describing a dataset.
213
+ :param db: An instance of LMDB File Storage, where files will be added.
214
+ :param base_dir: Directory where paths included into the CSV file are relative to.
215
+ :param key_base_dir: Directory where paths encoded into the keys of the LMDB File
216
+ Storage will be relative to. It should be either the same or an upper directory
217
+ compared to base dir. When this argument is omitted, the value of base dir
218
+ is used.
219
+ :param verbose: When set to False, progress messages will not be printed.
220
+ """
221
+ entries: list[dict[str, str]] = read_csv_file(csv_file, verbose=verbose)
222
+
223
+ if key_base_dir is None:
224
+ key_base_dir = base_dir
225
+
226
+ if verbose:
227
+ pbar = tqdm.tqdm(entries, desc="Writing CSV data to database", unit="file")
228
+ else:
229
+ pbar = entries
230
+
231
+ files_written: int = 0
232
+ for e in pbar:
233
+ # Generate key-path pairs for each path in the CSV.
234
+ files_to_write: dict[str, pathlib.Path] = find_files(
235
+ list(e.values()),
236
+ base_dir,
237
+ key_base_dir
238
+ )
239
+
240
+ files_written += write_files_to_db(files_to_write, db)
241
+
242
+ # Recursively add the contents of the encountered CSV files.
243
+ for p in files_to_write.values():
244
+ if p.suffix == ".csv":
245
+ files_written += add_csv_to_db(
246
+ p, db, p.parent, key_base_dir=key_base_dir, verbose=False
247
+ )
248
+
249
+ if verbose:
250
+ pbar.set_postfix({"Files Written": files_written})
251
+
252
+ return files_written
253
+
254
+
255
+ def verify_csv_in_db(
256
+ csv_file: pathlib.Path,
257
+ db: LMDBFileStorage,
258
+ base_dir: pathlib.Path,
259
+ key_base_dir: Optional[pathlib.Path] = None,
260
+ verbose: bool = True
261
+ ) -> int:
262
+ entries: list[dict[str, str]] = read_csv_file(csv_file, verbose=verbose)
263
+
264
+ if key_base_dir is None:
265
+ key_base_dir = base_dir
266
+
267
+ if verbose:
268
+ pbar = tqdm.tqdm(entries, desc="Verifying CSV data in database", unit="file")
269
+ else:
270
+ pbar = entries
271
+
272
+ files_verified: int = 0
273
+ for e in pbar:
274
+ # Generate key-path pairs for each path in the CSV.
275
+ files: dict[str, pathlib.Path] = find_files(
276
+ list(e.values()),
277
+ base_dir,
278
+ key_base_dir
279
+ )
280
+
281
+ files_verified += verify_files_in_db(files, db)
282
+
283
+ # Recursively verify the contents of the encountered CSV files.
284
+ for p in files.values():
285
+ if p.suffix == ".csv":
286
+ files_verified += verify_csv_in_db(
287
+ p, db, p.parent, key_base_dir=key_base_dir, verbose=False
288
+ )
289
+
290
+ if verbose:
291
+ pbar.set_postfix({"Files Verified": files_verified})
292
+
293
+ return files_verified
294
+
295
+
296
+ def find_files(
297
+ candidates: list[str],
298
+ base_dir: pathlib.Path,
299
+ key_base_dir: pathlib.Path
300
+ ) -> dict[str, pathlib.Path]:
301
+ files: dict[str, pathlib.Path] = {}
302
+ for c in candidates:
303
+ p: pathlib.Path = base_dir / c
304
+ key: str = str(p.relative_to(key_base_dir))
305
+ if p.exists() and p.is_file():
306
+ files[key] = p
307
+ return files
308
+
309
+
310
+ def write_files_to_db(files: dict[str, pathlib.Path], db: LMDBFileStorage) -> int:
311
+ for k, p in files.items():
312
+ data: bytes = read_raw_file(p)
313
+ db.write_file(k, data)
314
+ return len(files)
315
+
316
+
317
+ def verify_files_in_db(files: dict[str, pathlib.Path], db: LMDBFileStorage) -> int:
318
+ verified: int = 0
319
+ for k, p in files.items():
320
+ # Calculate md5 hash of the file in csv.
321
+ with p.open("rb") as f:
322
+ csv_file_hash: str = md5(f)
323
+ # Calculate md5 hash of the file in db.
324
+ db_file: io.BytesIO = db.open_file(k, mode="b")
325
+ db_file_hash: str = md5(db_file)
326
+ if csv_file_hash == db_file_hash:
327
+ verified += 1
328
+ else:
329
+ logging.error(f"File in DB not matching file in CSV: {str(p)}")
330
+ return verified
331
+
332
+
333
+ def read_csv_file(csv_file: pathlib.Path, verbose: bool = True) -> list[dict[str, str]]:
334
+ # Read the whole csv file.
335
+ if verbose:
336
+ logging.info(f"READING CSV: {str(csv_file)}")
337
+
338
+ entries: list[dict[str, str]] = []
339
+ with csv_file.open() as f:
340
+ reader: csv.DictReader = csv.DictReader(f, delimiter=",")
341
+ if verbose:
342
+ pbar = tqdm.tqdm(reader, desc="Reading CSV entries", unit="entry")
343
+ else:
344
+ pbar = reader
345
+ for row in pbar:
346
+ entries.append(row)
347
+
348
+ if verbose:
349
+ logging.info(f"TOTAL ENTRIES: {len(entries)}")
350
+
351
+ return entries
352
+
353
+
354
+ def read_raw_file(p: pathlib.Path) -> bytes:
355
+ with p.open("rb") as f:
356
+ data: bytes = f.read()
357
+ return data
358
+
359
+
360
+ def md5(stream) -> str:
361
+ """Calculates md5 hash of a file-like stream."""
362
+ hash_md5 = hashlib.md5()
363
+ for chunk in iter(lambda: stream.read(4096), b""):
364
+ hash_md5.update(chunk)
365
+ return hash_md5.hexdigest()
366
+
367
+
368
+ def print_dirs_from_graph(g: nx.DiGraph, n: str, depth: int = 0) -> None:
369
+ if depth > 0:
370
+ init_text: str = " " * (depth - 1) + "|.. "
371
+ else:
372
+ init_text: str = ""
373
+ text: str = f"{init_text}{n}"
374
+ print(text)
375
+
376
+ for s in sorted(g.successors(n)):
377
+ print_dirs_from_graph(g, s, depth+1)
378
+
379
+ if "items_num" in g.nodes[n]:
380
+ init_text = " " * (depth+1)
381
+ text = f"{init_text}({g.nodes[n]['items_num']} files)"
382
+ print(text)
383
+
384
+
385
+ if __name__ == "__main__":
386
+ logging.getLogger().setLevel(logging.INFO)
387
+ cli()
spai/data/random_degradations.py ADDED
@@ -0,0 +1,462 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Copyright (c) OpenMMLab. All rights reserved.
2
+ import random
3
+
4
+ import cv2
5
+ import numpy as np
6
+ import torch
7
+
8
+ from . import blur_kernels as blur_kernels
9
+
10
+
11
class RandomBlur:
    """Apply random blur to the input.

    Args:
        params (dict): A dictionary specifying the degradation settings:
            kernel families and sizes, sigma/angle/beta/omega ranges, the
            per-kernel random-walk ``*_step`` values, and an optional
            application probability ``prob``.
    """

    def __init__(self, params):
        self.params = params

    def get_kernel(self, num_kernels):
        """Samples ``num_kernels`` blur kernels with slowly drifting parameters.

        A kernel family and size are drawn once; the continuous parameters are
        sampled from their configured ranges and then updated by a clipped
        random walk between consecutive kernels (useful for multi-frame input).
        """
        kernel_type = random.choices(
            self.params['kernel_list'], weights=self.params['kernel_prob'])[0]
        kernel_size = random.choice(self.params['kernel_size'])

        # Initial parameter values and their random-walk step sizes.
        sigma_x_range = self.params.get('sigma_x', [0, 0])
        sigma_x = random.uniform(sigma_x_range[0], sigma_x_range[1])
        sigma_x_step = self.params.get('sigma_x_step', 0)

        sigma_y_range = self.params.get('sigma_y', [0, 0])
        sigma_y = random.uniform(sigma_y_range[0], sigma_y_range[1])
        sigma_y_step = self.params.get('sigma_y_step', 0)

        rotate_angle_range = self.params.get('rotate_angle', [-np.pi, np.pi])
        rotate_angle = random.uniform(rotate_angle_range[0],
                                      rotate_angle_range[1])
        rotate_angle_step = self.params.get('rotate_angle_step', 0)

        beta_gau_range = self.params.get('beta_gaussian', [0.5, 4])
        beta_gau = random.uniform(beta_gau_range[0], beta_gau_range[1])
        beta_gau_step = self.params.get('beta_gaussian_step', 0)

        beta_pla_range = self.params.get('beta_plateau', [1, 2])
        beta_pla = random.uniform(beta_pla_range[0], beta_pla_range[1])
        beta_pla_step = self.params.get('beta_plateau_step', 0)

        omega_range = self.params.get('omega', None)
        omega_step = self.params.get('omega_step', 0)
        if omega_range is None:  # follow Real-ESRGAN settings if not specified
            if kernel_size < 13:
                omega_range = [np.pi / 3., np.pi]
            else:
                omega_range = [np.pi / 5., np.pi]
        omega = random.uniform(omega_range[0], omega_range[1])

        # determine blurring kernel
        kernels = []
        for _ in range(0, num_kernels):
            kernel = blur_kernels.random_mixed_kernels(
                [kernel_type],
                [1],
                kernel_size,
                [sigma_x, sigma_x],
                [sigma_y, sigma_y],
                [rotate_angle, rotate_angle],
                [beta_gau, beta_gau],
                [beta_pla, beta_pla],
                [omega, omega],
                None,
            )
            kernels.append(kernel)

            # update kernel parameters (random walk, clipped back into the
            # configured ranges so the values stay valid)
            sigma_x += random.uniform(-sigma_x_step, sigma_x_step)
            sigma_y += random.uniform(-sigma_y_step, sigma_y_step)
            rotate_angle += random.uniform(-rotate_angle_step,
                                           rotate_angle_step)
            beta_gau += random.uniform(-beta_gau_step, beta_gau_step)
            beta_pla += random.uniform(-beta_pla_step, beta_pla_step)
            omega += random.uniform(-omega_step, omega_step)

            sigma_x = np.clip(sigma_x, sigma_x_range[0], sigma_x_range[1])
            sigma_y = np.clip(sigma_y, sigma_y_range[0], sigma_y_range[1])
            rotate_angle = np.clip(rotate_angle, rotate_angle_range[0],
                                   rotate_angle_range[1])
            beta_gau = np.clip(beta_gau, beta_gau_range[0], beta_gau_range[1])
            beta_pla = np.clip(beta_pla, beta_pla_range[0], beta_pla_range[1])
            omega = np.clip(omega, omega_range[0], omega_range[1])

        return kernels

    def _apply_random_blur(self, imgs):
        """Blurs one image or a list of images, one sampled kernel each."""
        is_single_image = False
        if isinstance(imgs, np.ndarray):
            is_single_image = True
            imgs = [imgs]

        # get kernel and blur the input
        kernels = self.get_kernel(num_kernels=len(imgs))
        imgs = [
            cv2.filter2D(img, -1, kernel)
            for img, kernel in zip(imgs, kernels)
        ]

        if is_single_image:
            imgs = imgs[0]

        return imgs

    def __call__(self, results):
        # "prob" controls whether the degradation is applied at all.
        if random.random() > self.params.get('prob', 1):
            return results

        results = self._apply_random_blur(results)

        return results

    def __repr__(self):
        repr_str = self.__class__.__name__
        repr_str += (f'(params={self.params})')
        return repr_str
126
+
127
+
128
class RandomResize:
    """Randomly resize the input.

    Args:
        params (dict): A dictionary specifying the degradation settings:
            interpolation options and their probabilities, either a fixed
            ``target_size`` or up/down scale ranges, and an optional
            application probability ``prob``.
    """

    def __init__(self, params):
        self.params = params

        # Map config names to OpenCV interpolation flags.
        self.resize_dict = dict(
            bilinear=cv2.INTER_LINEAR,
            bicubic=cv2.INTER_CUBIC,
            area=cv2.INTER_AREA,
            lanczos=cv2.INTER_LANCZOS4)

    def _random_resize(self, imgs):
        """Resizes one image or a list of images with a sampled scale/interp."""
        is_single_image = False
        if isinstance(imgs, np.ndarray):
            is_single_image = True
            imgs = [imgs]

        h, w = imgs[0].shape[:2]

        resize_opt = self.params['resize_opt']
        resize_prob = self.params['resize_prob']
        resize_opt = random.choices(resize_opt, weights=resize_prob)[0].lower()
        if resize_opt not in self.resize_dict:
            raise NotImplementedError(f'resize_opt [{resize_opt}] is not '
                                      'implemented')
        resize_opt = self.resize_dict[resize_opt]

        resize_step = self.params.get('resize_step', 0)

        # determine the target size, if not provided
        target_size = self.params.get('target_size', None)
        if target_size is None:
            resize_mode = random.choices(['up', 'down', 'keep'],
                                         weights=self.params['resize_mode_prob'])[0]
            resize_scale = self.params['resize_scale']
            if resize_mode == 'up':
                scale_factor = random.uniform(1, resize_scale[1])
            elif resize_mode == 'down':
                scale_factor = random.uniform(resize_scale[0], 1)
            else:
                scale_factor = 1

            # determine output size
            h_out, w_out = h * scale_factor, w * scale_factor
            if self.params.get('is_size_even', False):
                # Round both dimensions down to even numbers.
                h_out, w_out = 2 * (h_out // 2), 2 * (w_out // 2)
            target_size = (int(h_out), int(w_out))
        else:
            # A fixed target size disables the per-image scale random walk.
            resize_step = 0

        # resize the input
        if resize_step == 0:  # same target_size for all input images
            outputs = [
                cv2.resize(img, target_size[::-1], interpolation=resize_opt)
                for img in imgs
            ]
        else:  # different target_size for each input image
            outputs = []
            for img in imgs:
                img = cv2.resize(
                    img, target_size[::-1], interpolation=resize_opt)
                outputs.append(img)

                # update scale (random walk, clipped to the configured range)
                scale_factor += random.uniform(-resize_step, resize_step)
                scale_factor = np.clip(scale_factor, resize_scale[0],
                                       resize_scale[1])

                # determine output size
                h_out, w_out = h * scale_factor, w * scale_factor
                if self.params.get('is_size_even', False):
                    h_out, w_out = 2 * (h_out // 2), 2 * (w_out // 2)
                target_size = (int(h_out), int(w_out))

        if is_single_image:
            outputs = outputs[0]

        return outputs

    def __call__(self, results):
        # "prob" controls whether the degradation is applied at all.
        if random.random() > self.params.get('prob', 1):
            return results

        results = self._random_resize(results)

        return results

    def __repr__(self):
        repr_str = self.__class__.__name__
        repr_str += (f'(params={self.params})')
        return repr_str
228
+
229
+
230
class RandomNoise:
    """Apply random noise to the input.

    Currently support Gaussian noise and Poisson noise.

    Args:
        params (dict): A dictionary specifying the degradation settings:
            noise types and their probabilities, per-type level ranges and
            gray-noise probabilities, and an optional application
            probability ``prob``.
    """

    def __init__(self, params):
        self.params = params

    def _apply_gaussian_noise(self, imgs):
        # Sigma is configured in 8-bit units and converted to the [0, 1]
        # float image scale here.
        sigma_range = self.params['gaussian_sigma']
        sigma = random.uniform(sigma_range[0], sigma_range[1]) / 255.

        sigma_step = self.params.get('gaussian_sigma_step', 0)

        gray_noise_prob = self.params['gaussian_gray_noise_prob']
        is_gray_noise = random.random() < gray_noise_prob

        outputs = []
        for img in imgs:
            noise = torch.randn(*(img.shape)).numpy() * sigma
            if is_gray_noise:
                # Keep a single channel so identical noise is broadcast
                # across all color channels. Assumes HWC input — TODO confirm.
                noise = noise[:, :, :1]
            outputs.append(img + noise)

            # update noise level (random walk between consecutive images,
            # clipped back into the configured range)
            sigma += random.uniform(-sigma_step, sigma_step) / 255.
            sigma = np.clip(sigma, sigma_range[0] / 255.,
                            sigma_range[1] / 255.)

        return outputs

    def _apply_poisson_noise(self, imgs):
        scale_range = self.params['poisson_scale']
        scale = random.uniform(scale_range[0], scale_range[1])

        scale_step = self.params.get('poisson_scale_step', 0)

        gray_noise_prob = self.params['poisson_gray_noise_prob']
        is_gray_noise = random.random() < gray_noise_prob

        outputs = []
        for img in imgs:
            noise = img.copy()
            if is_gray_noise:
                # NOTE(review): channels are flipped with [2, 1, 0] before the
                # BGR2GRAY conversion — presumably inputs are RGB; confirm
                # against the data pipeline.
                noise = cv2.cvtColor(noise[..., [2, 1, 0]], cv2.COLOR_BGR2GRAY)
                noise = noise[..., np.newaxis]
            noise = np.clip((noise * 255.0).round(), 0, 255) / 255.
            # Number of distinct intensity levels, rounded up to a power of
            # two; used to scale the Poisson sampling.
            unique_val = 2**np.ceil(np.log2(len(np.unique(noise))))
            noise = torch.poisson(torch.from_numpy(noise * unique_val)).numpy() / unique_val - noise

            outputs.append(img + noise * scale)

            # update noise level (random walk, clipped to range)
            scale += random.uniform(-scale_step, scale_step)
            scale = np.clip(scale, scale_range[0], scale_range[1])

        return outputs

    def _apply_random_noise(self, imgs):
        """Dispatches to the sampled noise type for one image or a list."""
        noise_type = random.choices(
            self.params['noise_type'], weights=self.params['noise_prob'])[0]

        is_single_image = False
        if isinstance(imgs, np.ndarray):
            is_single_image = True
            imgs = [imgs]

        if noise_type.lower() == 'gaussian':
            imgs = self._apply_gaussian_noise(imgs)
        elif noise_type.lower() == 'poisson':
            imgs = self._apply_poisson_noise(imgs)
        else:
            raise NotImplementedError(f'"noise_type" [{noise_type}] is '
                                      'not implemented.')

        if is_single_image:
            imgs = imgs[0]

        return imgs

    def __call__(self, results):
        # "prob" controls whether the degradation is applied at all.
        if random.random() > self.params.get('prob', 1):
            return results

        results = self._apply_random_noise(results)

        return results

    def __repr__(self):
        repr_str = self.__class__.__name__
        repr_str += (f'(params={self.params})')
        return repr_str
330
+
331
+
332
class RandomJPEGCompression:
    """Apply random JPEG compression to the input.

    Args:
        params (dict): A dictionary specifying the degradation settings:
            a ``quality`` range, an optional per-image ``quality_step``
            random walk, and an optional application probability ``prob``.
    """

    def __init__(self, params):
        self.params = params

    def _apply_random_compression(self, imgs):
        """JPEG round-trips one image or a list of images in memory."""
        is_single_image = False
        if isinstance(imgs, np.ndarray):
            is_single_image = True
            imgs = [imgs]

        # determine initial compression level and the step size
        quality = self.params['quality']
        quality_step = self.params.get('quality_step', 0)
        jpeg_param = round(random.uniform(quality[0], quality[1]))

        # apply jpeg compression
        outputs = []
        for img in imgs:
            encode_param = [int(cv2.IMWRITE_JPEG_QUALITY), jpeg_param]
            # Images are scaled by 255 before encoding and back to [0, 1]
            # floats afterwards — assumes float input in [0, 1]; TODO confirm.
            _, img_encoded = cv2.imencode('.jpg', img * 255., encode_param)
            outputs.append(np.float32(cv2.imdecode(img_encoded, 1)) / 255.)

            # update compression level (random walk, clipped to range)
            jpeg_param += random.uniform(-quality_step, quality_step)
            jpeg_param = round(np.clip(jpeg_param, quality[0], quality[1]))

        if is_single_image:
            outputs = outputs[0]

        return outputs

    def __call__(self, results):
        # "prob" controls whether the degradation is applied at all.
        if random.random() > self.params.get('prob', 1):
            return results

        results = self._apply_random_compression(results)

        return results

    def __repr__(self):
        repr_str = self.__class__.__name__
        repr_str += (f'(params={self.params})')
        return repr_str
385
+
386
+
387
# Registry mapping config "type" names to their degradation classes; used by
# DegradationsWithShuffle to instantiate degradations from config dicts.
allowed_degradations = {
    'RandomBlur': RandomBlur,
    'RandomResize': RandomResize,
    'RandomNoise': RandomNoise,
    'RandomJPEGCompression': RandomJPEGCompression,
}
393
+
394
+
395
class DegradationsWithShuffle:
    """Apply random degradations to input, with degradations being shuffled.

    Degradation groups are supported. The order of degradations within the same
    group is preserved. For example, if we have degradations = [a, b, [c, d]]
    and shuffle_idx = None, then the possible orders are

    ::

        [a, b, [c, d]]
        [a, [c, d], b]
        [b, a, [c, d]]
        [b, [c, d], a]
        [[c, d], a, b]
        [[c, d], b, a]

    Args:
        degradations (list[dict]): The list of degradation configs. Each
            config is a dict with a "type" key (a name registered in
            ``allowed_degradations``) and a "params" dict; nested lists or
            tuples of configs form order-preserving groups.
        shuffle_idx (list | None, optional): The degradations corresponding to
            these indices are shuffled. If None, all degradations are shuffled.
    """

    def __init__(self, degradations, shuffle_idx=None):
        self.degradations = self._build_degradations(degradations)

        if shuffle_idx is None:
            self.shuffle_idx = list(range(0, len(self.degradations)))
        else:
            self.shuffle_idx = shuffle_idx

    def _build_degradations(self, degradations):
        """Instantiates degradation objects from their config dicts.

        Builds and returns a new (possibly nested) list instead of
        overwriting the caller's config list in place — the previous
        implementation mutated its argument as a side effect, corrupting
        any config reused to build a second instance.
        """
        built = []
        for degradation in degradations:
            if isinstance(degradation, (list, tuple)):
                # Recurse into a group; its internal order is preserved.
                built.append(self._build_degradations(degradation))
            else:
                degradation_cls = allowed_degradations[degradation['type']]
                built.append(degradation_cls(degradation['params']))
        return built

    def __call__(self, results):
        # Shuffle the selected degradations in place; the new order persists
        # into subsequent calls (matching the original behavior). This is
        # safe now that self.degradations is owned by this instance.
        if len(self.shuffle_idx) > 0:
            shuffle_list = [self.degradations[i] for i in self.shuffle_idx]
            random.shuffle(shuffle_list)
            for i, idx in enumerate(self.shuffle_idx):
                self.degradations[idx] = shuffle_list[i]

        # apply degradations to input
        for degradation in self.degradations:
            if isinstance(degradation, (tuple, list)):
                for sub_degradation in degradation:
                    results = sub_degradation(results)
            else:
                results = degradation(results)

        return results

    def __repr__(self):
        repr_str = self.__class__.__name__
        repr_str += (f'(degradations={self.degradations}, '
                     f'shuffle_idx={self.shuffle_idx})')
        return repr_str
spai/data/readers.py ADDED
@@ -0,0 +1,178 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # SPDX-FileCopyrightText: Copyright (c) 2025 Centre for Research and Technology Hellas
2
+ # and University of Amsterdam. All rights reserved.
3
+ # SPDX-License-Identifier: Apache-2.0
4
+ #
5
+ # Licensed under the Apache License, Version 2.0 (the "License");
6
+ # you may not use this file except in compliance with the License.
7
+ # You may obtain a copy of the License at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # Unless required by applicable law or agreed to in writing, software
12
+ # distributed under the License is distributed on an "AS IS" BASIS,
13
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14
+ # See the License for the specific language governing permissions and
15
+ # limitations under the License.
16
+
17
+ import csv
18
+ import io
19
+ import pathlib
20
+ from typing import Any, Union, Optional
21
+
22
+ import numpy as np
23
+ import torch
24
+ from PIL import Image
25
+ from torchvision.io import read_image
26
+
27
+ from spai.data import filestorage
28
+
29
+
30
class DataReader:
    """Abstract interface for reading dataset files from a storage backend.

    Concrete subclasses resolve the storage-relative paths passed to these
    methods against their own backend (e.g. a filesystem directory or an
    LMDB-based file storage).
    """

    def read_csv_file(self, path: str) -> list[dict[str, Any]]:
        """Parses a comma-delimited CSV file into one dict per row."""
        raise NotImplementedError

    def load_image(self, path: str, channels: int) -> Image.Image:
        """Loads an image, converted to grayscale ("L") when channels == 1,
        otherwise to "RGB"."""
        raise NotImplementedError

    def get_image_size(self, path: str) -> tuple[int, int]:
        """Returns the size of an image as a (width, height) tuple."""
        raise NotImplementedError

    def load_signals_from_csv(
        self,
        csv_path: str,
        column_name: str = "seg_map",
        channels: int = 1,
        data_specifier: Optional[dict[str, str]] = None
    ) -> list[np.ndarray]:
        """Loads all the signals specified in a column of a CSV file.

        The default values for the column name and the number of channels have
        been specified for the CSV containing the instance segmentation maps of
        an image.

        NOTE(review): the concrete implementations in this module currently
        return PIL images (their numpy conversion is commented out), so the
        declared list[np.ndarray] return type may not reflect runtime
        behavior — confirm before relying on it.
        """
        raise NotImplementedError

    def load_file_path_or_stream(self, path: str) -> Union[pathlib.Path, io.FileIO, io.BytesIO]:
        """Returns either a concrete filesystem path or an open stream for
        the file, depending on the backend."""
        raise NotImplementedError
58
+
59
+
60
class FileSystemReader(DataReader):
    """Reader that maps relative paths to absolute paths of the filesystem."""

    def __init__(self, root_path: pathlib.Path):
        super().__init__()
        # All paths handed to this reader are resolved relative to root_path.
        self.root_path: pathlib.Path = root_path

    def read_csv_file(self, path: str) -> list[dict[str, Any]]:
        """Parses a comma-delimited CSV file under root_path into row dicts."""
        with (self.root_path/path).open("r") as f:
            reader = csv.DictReader(f, delimiter=",")
            contents: list[dict[str, Any]] = [row for row in reader]
        return contents

    def get_image_size(self, path: str) -> tuple[int, int]:
        """Returns the size of an image as a (width, height) tuple without
        decoding the pixel data."""
        with Image.open(self.root_path/path) as image:
            image_size: tuple[int, int] = image.size
        return image_size

    def load_image(self, path: str, channels: int) -> Image.Image:
        """Loads an image under root_path.

        Converts to grayscale ("L") when channels == 1, otherwise to "RGB".
        Read failures are reported to stdout and re-raised.
        """
        try:
            image = Image.open(self.root_path/path)
            if channels == 1:
                image = image.convert("L")
            else:
                image = image.convert("RGB")
        except Exception as e:
            print(f"Failed to read: {path}")
            raise e
        # NOTE(review): the numpy conversion below is deliberately disabled,
        # so callers receive a PIL image even where sibling annotations
        # mention np.ndarray.
        # image = np.array(image)
        #
        # if len(image.shape) == 2:
        #     image = np.expand_dims(image, axis=2)
        return image

    def load_signals_from_csv(
        self,
        csv_path: str,
        column_name: str = "seg_map",
        channels: int = 1,
        data_specifier: Optional[dict[str, str]] = None
    ) -> list[np.ndarray]:
        """Loads all the signals specified in a column of a CSV file.

        Signal paths in the CSV are resolved relative to the CSV's own
        directory. Rows not matching ``data_specifier`` (if given) are
        skipped.
        """
        csv_data: list[dict[str, Any]] = self.read_csv_file(csv_path)

        signals: list[np.ndarray] = []
        for row in csv_data:
            # Ignore entries that do not match with the given data specifier.
            if data_specifier is not None and not data_specifier_matches_entry(row, data_specifier):
                continue

            signal_path: pathlib.Path = (self.root_path / csv_path).parent / row[column_name]
            signal: np.ndarray = self.load_image(str(signal_path.relative_to(self.root_path)),
                                                 channels=channels)
            signals.append(signal)

        return signals

    def load_file_path_or_stream(self, path: str) -> Union[pathlib.Path, io.FileIO, io.BytesIO]:
        """Returns the absolute filesystem path for a storage-relative path."""
        return self.root_path / path
117
+
118
+
119
class LMDBFileStorageReader(DataReader):
    """Reader that maps relative paths into an LMDBFileStorage.

    NOTE(review): load_file_path_or_stream is not overridden here, so it
    raises NotImplementedError from the base class — confirm whether callers
    depend on it for this backend.
    """

    def __init__(self, storage: filestorage.LMDBFileStorage):
        super().__init__()
        self.storage: filestorage.LMDBFileStorage = storage

    def read_csv_file(self, path: str) -> list[dict[str, Any]]:
        """Parses a comma-delimited CSV file stored in the LMDB storage."""
        stream = self.storage.open_file(path)
        reader = csv.DictReader(stream, delimiter=",")
        contents: list[dict[str, Any]] = [row for row in reader]
        return contents

    def get_image_size(self, path: str) -> tuple[int, int]:
        """Returns the size of an image as a (width, height) tuple without
        decoding the pixel data."""
        stream = self.storage.open_file(path, mode="b")
        with Image.open(stream) as image:
            image_size: tuple[int, int] = image.size
        return image_size

    def load_image(self, path: str, channels: int) -> Image.Image:
        """Loads an image from the LMDB storage.

        Converts to grayscale ("L") when channels == 1, otherwise to "RGB".
        The backing stream is closed before returning.
        """
        stream = self.storage.open_file(path, mode="b")
        with Image.open(stream) as image:
            if channels == 1:
                image = image.convert("L")
            else:
                image = image.convert("RGB")
        # NOTE(review): the numpy conversion below is deliberately disabled,
        # so callers receive a PIL image even where sibling annotations
        # mention np.ndarray.
        # image = np.array(image)
        stream.close()

        # if len(image.shape) == 2:
        #     image = np.expand_dims(image, axis=2)
        return image

    def load_signals_from_csv(
        self,
        csv_path: str,
        column_name: str = "seg_map",
        channels: int = 1,
        data_specifier: Optional[dict[str, str]] = None
    ) -> list[np.ndarray]:
        """Loads all the signals specified in a column of a CSV file.

        Signal paths in the CSV are resolved relative to the CSV's own
        directory inside the storage. Rows not matching ``data_specifier``
        (if given) are skipped.
        """
        csv_data: list[dict[str, Any]] = self.read_csv_file(csv_path)

        signals: list[np.ndarray] = []
        for row in csv_data:
            # Ignore entries that do not match with the given data specifier.
            if data_specifier is not None and not data_specifier_matches_entry(row, data_specifier):
                continue

            signal_path: str = str(pathlib.Path(csv_path).parent / row[column_name])
            signal: np.ndarray = self.load_image(signal_path, channels=channels)
            signals.append(signal)

        return signals
171
+
172
+
173
def data_specifier_matches_entry(entry: dict[str, str], specifier: dict[str, str]) -> bool:
    """Checks whether a CSV entry matches a data specifier.

    An entry matches when every key of the specifier is present in the entry
    with an equal value. An empty specifier matches any entry.
    """
    return all(
        key in entry and entry[key] == value
        for key, value in specifier.items()
    )
spai/data_utils.py ADDED
@@ -0,0 +1,50 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # SPDX-FileCopyrightText: Copyright (c) 2025 Centre for Research and Technology Hellas
2
+ # and University of Amsterdam. All rights reserved.
3
+ # SPDX-License-Identifier: Apache-2.0
4
+ #
5
+ # Licensed under the Apache License, Version 2.0 (the "License");
6
+ # you may not use this file except in compliance with the License.
7
+ # You may obtain a copy of the License at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # Unless required by applicable law or agreed to in writing, software
12
+ # distributed under the License is distributed on an "AS IS" BASIS,
13
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14
+ # See the License for the specific language governing permissions and
15
+ # limitations under the License.
16
+
17
+ import pathlib
18
+ import csv
19
+ import hashlib
20
+ from typing import Any, Optional
21
+
22
+
23
+ def read_csv_file(
24
+ path: pathlib.Path,
25
+ delimiter: str = ","
26
+ ) -> list[dict[str, Any]]:
27
+ with path.open("r") as f:
28
+ reader = csv.DictReader(f, delimiter=delimiter)
29
+ contents: list[dict[str, Any]] = [row for row in reader]
30
+ return contents
31
+
32
+
33
+ def write_csv_file(
34
+ data: list[dict[str, Any]],
35
+ output_file: pathlib.Path,
36
+ fieldnames: Optional[list[str]] = None,
37
+ delimiter: str = "|"
38
+ ) -> None:
39
+ if fieldnames is None:
40
+ fieldnames = list(data[0].keys())
41
+ with output_file.open("w", newline="") as f:
42
+ writer: csv.DictWriter = csv.DictWriter(f, fieldnames=fieldnames, delimiter=delimiter)
43
+ writer.writeheader()
44
+ for r in data:
45
+ writer.writerow(r)
46
+
47
+
48
+ def compute_file_md5(path: pathlib.Path) -> str:
49
+ with path.open("rb") as f:
50
+ return hashlib.md5(f.read()).hexdigest()