Krypsis-COD 🐙: underwater camouflaged-animal segmentation, tiny

A compact (4.8 M params, 4.6 MB int8) segmenter for camouflaged marine animals. The frontier for this task is giant SAM/SAM2 adapters (Dual-SAM, MAS-SAM, SAM2-WaveUNet) at hundreds of MB; Krypsis-COD is ~130× smaller yet built the right way: a pretrained backbone with the boundary + frequency supervision the COD literature shows actually matters.

Segmentation demo

Held-out COD10K-Aquatic frames (flounder, frogfish, stingray, turtle; not in training): raw input → Krypsis-COD segmentation.

Parameters: 4,770,950 (~4.8 M)
Size on disk: 19.2 MB fp32 · 9.7 MB fp16 · 4.6 MB int8
Backbone: pvt_v2_b0 (ImageNet-pretrained)
Input: 3×352×352 RGB
Output: camouflaged-animal mask + edge map
Prior stem: AquaWave, 0 learnable params (UDCP + WB-residual + Haar high-freq)
CPU latency: 270 ms/image (single thread)

Method (what the literature says actually works)

  • Pretrained PVTv2 backbone. Camouflage is a global-context problem; transformer features + ImageNet pretraining are what every competitive COD model relies on.
  • AquaWave prior stem (0 params) feeds physics + frequency cues straight into the backbone: an Underwater Dark Channel Prior transmission map, a grey-world white-balance colour residual, and a Haar high-frequency energy map. The rationale: light attenuation leaks in colour, camouflage leaks in frequency.
  • Edge / boundary co-supervision, the most-cited COD lever: an edge head is supervised by the mask boundary and fused into the mask head.
  • Deep multi-level supervision over the FPN decoder.
  • Trained at 352² on ~7,000 images (CAMO + COD10K corpus), mixed precision, discriminative LR (low for the pretrained backbone, high for the new heads).
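
The three zero-parameter cues named in the AquaWave bullet can be sketched in plain NumPy. This is an illustrative approximation, not the released stem: the patch size, the background-light estimate (per-channel maximum), the single Haar level, and all function names are assumptions, and how the maps are combined and fed to the backbone is not shown here.

```python
import numpy as np

def min_filter(a, k):
    """Sliding-window minimum over a k x k patch (edge-padded, k odd)."""
    pad = k // 2
    p = np.pad(a, pad, mode="edge")
    h, w = a.shape
    out = np.full((h, w), np.inf, dtype=a.dtype)
    for dy in range(k):
        for dx in range(k):
            out = np.minimum(out, p[dy:dy + h, dx:dx + w])
    return out

def udcp_transmission(img, patch=7):
    """UDCP-style transmission: dark channel over green and blue only."""
    gb = img[..., 1:3]                              # (H, W, 2), red is ignored
    # Assumption: background light approximated by the per-channel maximum.
    light = gb.reshape(-1, 2).max(axis=0) + 1e-6
    dark = min_filter((gb / light).min(axis=2), patch)
    return 1.0 - dark                               # (H, W), values in [0, 1]

def white_balance_residual(img):
    """Grey-world white balance, returned as (balanced - original)."""
    means = img.reshape(-1, 3).mean(axis=0)
    gains = means.mean() / (means + 1e-6)           # pull channel means together
    balanced = np.clip(img * gains, 0.0, 1.0)
    return balanced - img                           # (H, W, 3)

def haar_highfreq_energy(gray):
    """One-level 2-D Haar transform; summed energy of the three detail bands."""
    h, w = gray.shape
    g = gray[:h - h % 2, :w - w % 2]                # crop to even size
    a, b = g[0::2, 0::2], g[0::2, 1::2]
    c, d = g[1::2, 0::2], g[1::2, 1::2]
    row_diff = (a + b - c - d) / 2.0
    col_diff = (a - b + c - d) / 2.0
    diag = (a - b - c + d) / 2.0
    return row_diff ** 2 + col_diff ** 2 + diag ** 2  # (H/2, W/2)

# Random stand-in for a 352x352 RGB frame scaled to [0, 1].
rng = np.random.default_rng(0)
img = rng.random((352, 352, 3), dtype=np.float32)

t = udcp_transmission(img)
wb = white_balance_residual(img)
hf = haar_highfreq_energy(img.mean(axis=2))
print(t.shape, wb.shape, hf.shape)
```

None of these functions has learnable weights, which is the property the "0 params" label refers to; every value is a fixed function of the input pixels.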

See PAPER.md and RESEARCH_DOSSIER.md for the full method and reading list.

Results

CAMO test (standard COD benchmark):

metric Krypsis-COD
S-measure ↑ 0.748
weighted-F ↑ 0.643
E-measure ↑ 0.793
MAE ↓ 0.114
IoU ↑ 0.563
Dice ↑ 0.678

Held-out (COD10K-heavy, unseen): S-measure 0.864 · MAE 0.037.

This is not a SOTA-accuracy record; the SAM-scale giants score higher. It is a strong accuracy-per-byte point: competitive structure metrics at ~130× smaller size.
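
For readers new to the simpler metrics in the table above, here is a minimal sketch of per-image MAE, IoU, and Dice. The 0.5 threshold and the function names are assumptions; the evaluation script behind the reported numbers may threshold or average differently, and S-measure, weighted-F, and E-measure are more involved and not covered.

```python
import numpy as np

def mae(prob, gt):
    """Mean absolute error between a [0, 1] prediction map and a binary mask."""
    return float(np.abs(prob.astype(np.float64) - gt.astype(np.float64)).mean())

def iou_and_dice(prob, gt, thr=0.5, eps=1e-8):
    """IoU and Dice after thresholding the prediction (threshold is an assumption)."""
    pred = prob > thr
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou = inter / (union + eps)
    dice = 2 * inter / (pred.sum() + gt.sum() + eps)
    return float(iou), float(dice)

# Toy case: a 2x2 ground-truth square, over-segmented by one extra column.
gt = np.zeros((4, 4))
gt[1:3, 1:3] = 1
prob = np.zeros((4, 4))
prob[1:3, 1:4] = 1.0

m = mae(prob, gt)
iou, dice = iou_and_dice(prob, gt)
print(f"{m:.3f} {iou:.3f} {dice:.3f}")  # 0.125 0.667 0.800
```

In the toy case 4 pixels overlap, the union is 6, and the two masks hold 4 + 6 pixels, giving IoU 4/6 and Dice 8/10.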

Usage

import json, torch, numpy as np
from PIL import Image
from krypsis.cod import KrypsisCOD

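# Read the backbone name from the shipped config (first whitespace-separated token).
with open("config.json") as f:
    cfg = json.load(f)
m = KrypsisCOD(backbone=cfg["backbone"].split()[0], pretrained=False)
# Load the released weights on CPU and switch to inference mode.
m.load_state_dict(torch.load("pytorch_model.bin", map_location="cpu"))
m.eval()

# Resize to the 352×352 training resolution and scale pixels to [0, 1].
img = np.asarray(Image.open("reef.jpg").convert("RGB").resize((352, 352)), dtype=np.float32) / 255.0
x = torch.from_numpy(img).permute(2, 0, 1)[None]  # HWC -> 1×C×H×W
with torch.no_grad():
    # Threshold the sigmoid of the mask logits at 0.5 to get a boolean mask.
    mask = torch.sigmoid(m(x, want_aux=False)["mask"])[0, 0] > 0.5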
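
Because the image is resized to 352×352 before inference, the resulting mask usually has to be mapped back to the original photo size. A minimal NumPy sketch of a nearest-neighbour resize follows; `resize_nearest` and the 1080×1920 target size are hypothetical, and a random-free stand-in array replaces the real model output.

```python
import numpy as np

def resize_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize of a 2-D array using integer index maps."""
    h, w = mask.shape
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return mask[rows[:, None], cols[None, :]]

# Stand-in for a 352x352 boolean mask; a real one would come from the model.
small = np.zeros((352, 352), dtype=bool)
small[100:200, 150:300] = True

# Hypothetical original photo size (height, width).
full = resize_nearest(small, 1080, 1920)
print(full.shape, full.dtype)
```

With the snippet above, the tensor can be converted first with `mask.numpy()`. Nearest-neighbour keeps the mask strictly binary; an alternative is to resize the sigmoid probabilities with a smoother interpolation and threshold afterwards.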

Training & data

  • Data: Umair2002/COD_CAMO_train_data (~7,000 paired camouflage images+masks) for training; PassbyGrocer/CAMO test for the benchmark.
  • Compute: single Modal A10G GPU, 30 epochs.
  • Loss: boundary-weighted structure loss + Dice + deep supervision + edge BCE.
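
The edge BCE term needs a boundary target for every training mask. How the boundary is extracted is not stated here; one common choice, sketched below under that assumption, is the inner morphological boundary (the mask minus its one-step erosion). The function names are illustrative.

```python
import numpy as np

def erode4(mask):
    """Binary erosion with a 4-neighbour (cross-shaped) structuring element."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    return (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
            & p[1:-1, :-2] & p[1:-1, 2:])

def boundary_target(mask):
    """Inner boundary: foreground pixels that one erosion step removes."""
    mask = mask.astype(bool)
    return mask & ~erode4(mask)

# Toy mask: a 3x3 square inside a 5x5 image.
mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 1:4] = True

edge = boundary_target(mask)
print(edge.astype(int))
print(int(edge.sum()))  # 8 boundary pixels; only the centre pixel is interior
```

Padding with background means an object that touches the image border also gets edge pixels along that border. A thicker target can be obtained by eroding more than once before subtracting.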

Limitations

Camouflage is hard; this tiny model trails the giant SAM adapters on raw accuracy. Held-out IoU on small COD10K objects is lower than its structure scores suggest. Not a substitute for expert identification.

Citation

@software{krypsis_cod_2026,
  title  = {Krypsis-COD: tiny underwater camouflaged-animal segmentation with a
            pretrained backbone and a zero-parameter physics-frequency prior},
  year   = {2026},
  url    = {https://huggingface.co/ryanrana/krypsis-nano}
}