PicoTrust: Robust Image Steganography

PicoTrust is a neural image steganography model that embeds hidden binary messages in images so that they survive real-world distortions such as JPEG compression, blur, noise, and colour shifts.

It builds on StegaStamp (Tancik et al., CVPR 2020), with the architectural changes listed under Architecture and Key Design Decisions.

Architecture

  • Encoder: U-Net (512x512) + E_post refinement (1-channel grayscale residual, softsign bounding with strength annealing)
  • Decoder: StegaStamp-style CNN (256x256) with compact STN for geometric alignment
  • Discriminator: WGAN PatchGAN
  • Message: 100 bits per image
  • Bounding: Softsign residual with strength annealing (1.0 → target over training)
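The bounded grayscale residual described above can be sketched in a few lines. This is a minimal NumPy illustration, not the model's code: `apply_residual` and `raw_residual` are hypothetical names, and the final clip to [0, 1] is an assumption that the card does not state.

```python
import numpy as np

def softsign(x):
    """Softsign squashes any real value into the open interval (-1, 1)."""
    return x / (1.0 + np.abs(x))

def apply_residual(image, raw_residual, strength):
    """Add a bounded 1-channel residual to a 3-channel image.

    image:        (3, H, W) floats in [0, 1]
    raw_residual: (1, H, W) unbounded floats (stand-in for the E_post output)
    strength:     scalar bound on the residual magnitude
    """
    bounded = strength * softsign(raw_residual)  # every value lies in (-strength, strength)
    stego = image + bounded                      # (1, H, W) broadcasts over the 3 channels
    return np.clip(stego, 0.0, 1.0)              # assumption: keep pixels in the valid range

rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.8, size=(3, 8, 8))
raw = rng.normal(scale=5.0, size=(1, 8, 8))
stego = apply_residual(image, raw, strength=0.02)
delta = stego - image
```

Because the same single-channel value is added to R, G, and B, `delta` is identical across channels, so the change is a pure luminance offset whose magnitude stays below `strength`.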

Checkpoints

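| Model | File | Strength | PSNR | Bit accuracy (clean) | Bit accuracy (JPEG Q10) |
|-------|------|----------|------|----------------------|-------------------------|
| v4 | v4/picotrust_v4_200k.pt | 0.02 | 35.56 dB | 97.8% | 97.6% |
| v2 | v2/picotrust_v2_200k.pt | 0.03 | 32.82 dB | 98.4% | 98.6% |
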
  • v4: Best balance of visual quality and accuracy. +2.7 dB PSNR over v2 for a 0.6-point drop in clean bit accuracy (1.0 point at JPEG Q10).
  • v2: Highest raw accuracy and JPEG robustness, at a lower PSNR.

Both models have zero colour shifts (grayscale residual) and >92% accuracy across all distortion types.
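For readers who want to reproduce the two metrics in the table, here is a minimal NumPy sketch. It assumes images scaled to [0, 1] (peak value 1.0) and a 0.5 decision threshold; the card does not state how its own numbers were computed, so treat the function names and conventions as illustrative.

```python
import numpy as np

def psnr(original, encoded, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    mse = np.mean((original - encoded) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))

def bit_accuracy(probabilities, message):
    """Fraction of bits recovered after thresholding probabilities at 0.5."""
    bits = (probabilities > 0.5).astype(np.float64)
    return float(np.mean(bits == message))

def psnr_floor(strength, max_value=1.0):
    """Lower bound on PSNR when every pixel changes by less than `strength`."""
    return float(20.0 * np.log10(max_value / strength))

original = np.full((3, 4, 4), 0.5)
encoded = original + 0.01                      # uniform residual of 0.01
example_psnr = psnr(original, encoded)         # MSE is 1e-4 on a [0, 1] scale

message = np.array([1.0, 0.0, 1.0, 1.0])
probs = np.array([0.9, 0.2, 0.4, 0.7])         # third bit is decoded incorrectly
example_accuracy = bit_accuracy(probs, message)
```

If the Strength column is the softsign bound on a [0, 1] pixel scale (an assumption), the residual magnitude never reaches `strength`, so PSNR cannot fall below `psnr_floor(strength)`: roughly 34.0 dB for 0.02 and 30.5 dB for 0.03. The reported 35.56 dB and 32.82 dB both sit above those floors.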

Robustness (v4 @ 200k steps)

| Distortion | Bit accuracy |
|------------|--------------|
| Clean | 97.8% |
| JPEG Q10 | 97.6% |
| Gaussian blur (σ=3) | 98.0% |
| Gaussian noise (σ=0.05) | 96.6% |
| Brightness ±0.3 | 94.6% |
| Contrast ±0.3 | 98.2% |
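A robustness table like this one can be produced with a small evaluation loop. The sketch below is an assumption-laden illustration: the distortion definitions (additive brightness offset, contrast scaling around the image mean, fixed rather than randomly sampled magnitudes) are guesses at what the labels mean, and `toy_decode` is a stand-in that reads bits straight from pixel values, not the PicoTrust decoder.

```python
import numpy as np

def gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise, then clip back to [0, 1]."""
    return np.clip(img + rng.normal(scale=sigma, size=img.shape), 0.0, 1.0)

def brightness(img, delta):
    """Shift every pixel by a constant offset."""
    return np.clip(img + delta, 0.0, 1.0)

def contrast(img, delta):
    """Scale pixel deviations from the image mean by (1 + delta)."""
    mean = img.mean()
    return np.clip((img - mean) * (1.0 + delta) + mean, 0.0, 1.0)

def robustness_report(decode_fn, stego, message, distortions):
    """Return {distortion name: bit accuracy} for one encoded image."""
    report = {}
    for name, distort in distortions.items():
        probs = decode_fn(distort(stego))
        bits = (probs > 0.5).astype(message.dtype)
        report[name] = float(np.mean(bits == message))
    return report

def toy_decode(img):
    """Stand-in decoder: treats each pixel value as a bit probability."""
    return img

rng = np.random.default_rng(0)
message = rng.integers(0, 2, size=100)
stego = np.where(message == 1, 0.75, 0.25)     # toy "image": one pixel per bit

distortions = {
    "clean": lambda x: x,
    "noise_0.05": lambda x: gaussian_noise(x, 0.05, rng),
    "brightness_+0.3": lambda x: brightness(x, 0.3),
    "contrast_+0.3": lambda x: contrast(x, 0.3),
}
report = robustness_report(toy_decode, stego, message, distortions)
```

To evaluate the real model, `decode_fn` would wrap the decoder's forward pass and `stego` would be the encoder output; JPEG and blur would need an image library and are left out here.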

Usage

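import torch
from picode.models.picotrust import Encoder, Decoder

# Load checkpoint
ckpt = torch.load("picotrust_v4_200k.pt", map_location="cpu")

# Reconstruct encoder/decoder and switch them to inference mode
encoder = Encoder(message_length=100, image_size=512)
decoder = Decoder(message_length=100, image_size=256)

encoder.load_state_dict(ckpt["encoder"])
decoder.load_state_dict(ckpt["decoder"])
encoder.eval()
decoder.eval()

# Placeholder inputs: replace with a real RGB image scaled to [0,1] and your own payload
image = torch.rand(1, 3, 512, 512)
message = torch.randint(0, 2, (1, 100)).float()

with torch.no_grad():
    # Encode: image (1,3,512,512) [0,1] + message (1,100) binary
    encoded = encoder(image, message)

    # Decode: returns probabilities (1,100) in [0,1]
    decoded = decoder(encoded)

# Threshold the probabilities and compare against the embedded message
bits = (decoded > 0.5).float()
bit_accuracy = (bits == message).float().mean().item()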
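The usage snippet expects a binary message of 100 bits but does not say how to build one from real data. As one possible approach, the sketch below packs a short UTF-8 string into a zero-padded 100-bit list and unpacks it again. `text_to_bits` and `bits_to_text` are hypothetical helpers, not part of the `picode` package, and the scheme has no error correction.

```python
def text_to_bits(text, length=100):
    """Encode text as UTF-8, most significant bit first, zero-padded to `length` bits."""
    data = text.encode("utf-8")
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    if len(bits) > length:
        raise ValueError(f"payload needs {len(bits)} bits, only {length} available")
    return bits + [0] * (length - len(bits))

def bits_to_text(bits):
    """Invert text_to_bits: regroup into bytes, drop zero padding, decode as UTF-8."""
    usable = len(bits) - len(bits) % 8       # ignore trailing bits that do not fill a byte
    data = bytearray()
    for i in range(0, usable, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | int(bit)
        data.append(byte)
    return data.rstrip(b"\x00").decode("utf-8", errors="replace")

message_bits = text_to_bits("picotrust!")    # 10 bytes = 80 bits, padded to 100
recovered = bits_to_text(message_bits)
```

With this layout, 100 bits hold at most 12 bytes. The list can be turned into model input with `torch.tensor([message_bits], dtype=torch.float32)`, which has shape (1, 100). At the reported 97.8% bit accuracy about 2 of 100 bits are expected to be wrong per image, so a practical payload would need an error-correcting code on top of this sketch.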

Training

Trained on COCO train2017 (~118K images) for 200K steps on a single NVIDIA T4 GPU.

Config: configs/picotrust_v4.yaml (v4) / configs/picotrust_v2.yaml (v2)

Key Design Decisions

  1. Grayscale residual: 1-channel E_post output broadcast to 3 channels, which eliminates colour shifts by construction
  2. Softsign bounding: strength * x / (1 + |x|), whose gradients shrink far more slowly than tanh's for large |x|
  3. Strength annealing: start at strength 1.0 (effectively unbounded), then anneal to the target; this prevents bootstrap collapse
  4. MSE message loss: No trivial equilibrium unlike BCE
  5. Zero-init E_post: Residual starts at exactly zero, grows gradually
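Two of these decisions are easy to check numerically. The sketch below shows a linear annealing schedule and compares the softsign and tanh gradients at a large input. The linear shape and the `anneal_steps` parameter are assumptions; the card only says that strength goes from 1.0 to the target over training.

```python
import math

def annealed_strength(step, target, anneal_steps, start=1.0):
    """Linearly interpolate from `start` to `target`, then hold at `target`."""
    frac = min(max(step / anneal_steps, 0.0), 1.0)
    return (1.0 - frac) * start + frac * target

def softsign_grad(x):
    """d/dx of x / (1 + |x|): decays polynomially, like 1 / x^2."""
    return 1.0 / (1.0 + abs(x)) ** 2

def tanh_grad(x):
    """d/dx of tanh(x): decays exponentially, like 4 * exp(-2|x|)."""
    return 1.0 - math.tanh(x) ** 2

# Strength at a few training steps, using a hypothetical 100k-step anneal to the v4 target.
schedule = [annealed_strength(s, target=0.02, anneal_steps=100_000)
            for s in (0, 50_000, 100_000, 200_000)]

# Ratio of the two gradients when the pre-activation is saturated at x = 10.
grad_ratio = softsign_grad(10.0) / tanh_grad(10.0)
```

Both gradients tend to zero as |x| grows, but at x = 10 the softsign gradient is still several orders of magnitude larger than the tanh gradient, which is the practical meaning of item 2.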

License

Apache 2.0
