---
title: LTMarX
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---

# LTMarX – Video Watermarking

Imperceptible 32-bit watermarking for video. Embeds a payload into the luminance channel using DWT/DCT transform-domain quantization (DM-QIM) with BCH error correction.

Survives re-encoding, rescaling, brightness/contrast/saturation adjustments, and cropping up to ~20%.

All processing runs in the browser; no server round-trips needed.

## Quick Start

```bash
npm install
npm run dev      # Web UI at localhost:5173
npm test         # Run test suite
```

## CLI

```bash
npx tsx server/cli.ts embed -i input.mp4 -o output.mp4 --key SECRET --preset moderate --payload DEADBEEF
npx tsx server/cli.ts detect -i output.mp4 --key SECRET
npx tsx server/cli.ts presets
```

## Docker

```bash
docker build -t ltmarx .
docker run -p 7860:7860 ltmarx
```

## Architecture

```
core/           Pure TypeScript watermark engine (isomorphic, zero platform deps)
├── dwt.ts          Haar DWT (forward/inverse, multi-level)
├── dct.ts          8×8 DCT with zigzag scan
├── dmqim.ts        Dither-Modulated QIM (embed/extract with soft decisions)
├── bch.ts          BCH(63,36,5) over GF(2^6), Berlekamp-Massey decoding
├── crc.ts          CRC-4 integrity check
├── tiling.ts       Periodic tile layout + autocorrelation-based grid recovery
├── masking.ts      Perceptual masking (variance-adaptive quantization step)
├── keygen.ts       Seeded PRNG for dithers and permutations
├── embedder.ts     Y-plane → watermarked Y-plane
├── detector.ts     Y-plane(s) → payload + confidence
├── presets.ts      Named configurations (light → fortress)
└── types.ts        Shared types

web/            Frontend (Vite + React + Tailwind)
├── src/
│   ├── App.tsx
│   ├── components/
│   │   ├── EmbedPanel.tsx          Upload, configure, embed, download
│   │   ├── DetectPanel.tsx         Upload, detect, display results
│   │   ├── ComparisonView.tsx      Side-by-side / difference viewer
│   │   ├── RobustnessTest.tsx      Automated attack battery (re-encode, crop, etc.)
│   │   ├── HowItWorks.tsx          Interactive explainer with D3 visualizations
│   │   ├── StrengthSlider.tsx      Preset selector with snap points
│   │   ├── ResultCard.tsx          Detection result display
│   │   └── ApiDocs.tsx             Inline API reference
│   ├── lib/
│   │   └── video-io.ts            Frame extraction, encoding, attack simulations
│   └── workers/
│       └── watermark.worker.ts
└── index.html

server/         Node.js CLI + HTTP API
├── cli.ts          CLI for embed/detect
├── api.ts          HTTP server (serves web UI + REST endpoints)
└── ffmpeg-io.ts    FFmpeg subprocess for YUV420p I/O

tests/          Vitest test suite
```

**Design principle:** `core/` has zero platform dependencies and operates on raw `Uint8Array` Y-plane buffers. The same code runs in the browser (via Canvas + ffmpeg.wasm) and on the server (via Node.js + FFmpeg).
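For callers that start from RGBA pixels (e.g. Canvas `getImageData` in the browser), producing that buffer takes only a few lines. This hypothetical helper, not part of `core/`, shows the contract using BT.601 luma weights:

```typescript
// Hypothetical helper (not in core/): convert packed RGBA pixels into the
// raw Uint8Array Y plane the engine consumes, using BT.601 luma weights.
function rgbaToYPlane(rgba: Uint8Array, width: number, height: number): Uint8Array {
  const y = new Uint8Array(width * height);
  for (let i = 0; i < width * height; i++) {
    const r = rgba[i * 4], g = rgba[i * 4 + 1], b = rgba[i * 4 + 2];
    // BT.601: Y = 0.299 R + 0.587 G + 0.114 B, clamped to [0, 255]
    y[i] = Math.min(255, Math.max(0, Math.round(0.299 * r + 0.587 * g + 0.114 * b)));
  }
  return y;
}

// A white RGBA pixel maps to luma 255, a black one to 0.
const plane = rgbaToYPlane(new Uint8Array([255, 255, 255, 255, 0, 0, 0, 255]), 2, 1);
```

On the server this conversion is unnecessary: FFmpeg's YUV420p output already carries the Y plane as a contiguous byte buffer.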

## Watermarking Pipeline

### Embedding

```
Y plane → 2-level Haar DWT → HL subband → periodic tile grid →
  per tile: 8×8 DCT blocks → select mid-freq zigzag coefficients →
  DM-QIM embed coded bits (with per-block dithering and perceptual masking) →
  inverse DCT → inverse DWT → modified Y plane
```
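The first transform step can be sketched as a single 1-D Haar level. This is a minimal sketch: the transform in `dwt.ts` is 2-D and multi-level, and its normalization may differ.

```typescript
// One 1-D Haar DWT level: averages of adjacent pairs form the low
// (approximation) subband, halved differences form the high (detail) subband.
function haarForward(x: Float64Array): { low: Float64Array; high: Float64Array } {
  const n = x.length >> 1;
  const low = new Float64Array(n), high = new Float64Array(n);
  for (let i = 0; i < n; i++) {
    low[i] = (x[2 * i] + x[2 * i + 1]) / 2;   // approximation coefficients
    high[i] = (x[2 * i] - x[2 * i + 1]) / 2;  // detail coefficients
  }
  return { low, high };
}

function haarInverse(low: Float64Array, high: Float64Array): Float64Array {
  const x = new Float64Array(low.length * 2);
  for (let i = 0; i < low.length; i++) {
    x[2 * i] = low[i] + high[i];
    x[2 * i + 1] = low[i] - high[i];
  }
  return x;
}

// The round trip is exact; a 2-level 2-D transform composes this step
// along rows and columns, twice.
const sig = Float64Array.from([10, 12, 14, 200]);
const { low, high } = haarForward(sig);
const back = haarInverse(low, high);
```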

### Payload Encoding

```
32-bit payload → CRC-4 append → BCH(63,36,5) encode → keyed interleave →
  map to DCT coefficients across tiles (with wraparound redundancy)
```
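The CRC-4 append/verify step can be sketched as follows. The polynomial here (CRC-4-ITU, x^4 + x + 1) is an assumption; the polynomial `crc.ts` actually uses is not documented in this README.

```typescript
// Bitwise CRC-4 over a bit array, assuming polynomial x^4 + x + 1 (0x3).
function crc4(bits: number[]): number {
  let reg = 0;
  const step = (bit: number) => {
    const msb = (reg >> 3) & 1;
    reg = ((reg << 1) | bit) & 0xf;
    if (msb) reg ^= 0x3; // reduce modulo the polynomial
  };
  for (const bit of bits) step(bit);
  for (let i = 0; i < 4; i++) step(0); // flush so every bit passes through
  return reg;
}

// Append the 4 checksum bits, MSB first.
function appendCrc(bits: number[]): number[] {
  const c = crc4(bits);
  return [...bits, (c >> 3) & 1, (c >> 2) & 1, (c >> 1) & 1, c & 1];
}

// Recompute over the message and compare against the trailing 4 bits.
function verifyCrc(bits: number[]): boolean {
  const n = bits.length;
  const stored = (bits[n - 4] << 3) | (bits[n - 3] << 2) | (bits[n - 2] << 1) | bits[n - 1];
  return crc4(bits.slice(0, -4)) === stored;
}
```

The checksum lets the detector reject candidate decodes that slip past BCH but are still wrong, which matters for false-positive rejection.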

### Detection

```
Y plane(s) → DWT → HL subband → tile grid →
  per tile: DCT → DM-QIM soft extract →
  soft-combine across tiles and frames → keyed de-interleave →
  BCH soft decode → CRC verify → payload
```
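The soft-combine stage can be sketched like this. It is a simplified stand-in: the real detector passes soft values on to the BCH soft decoder rather than thresholding immediately.

```typescript
// Average per-tile/per-frame soft bits (values in [-1, 1]) and threshold to
// hard bits; the mean magnitude serves as a crude 0-1 confidence, since
// disagreeing observations cancel toward zero.
function softCombine(observations: Float64Array[]): { hard: number[]; confidence: number } {
  const n = observations[0].length;
  const avg = new Float64Array(n);
  for (const obs of observations) {
    for (let i = 0; i < n; i++) avg[i] += obs[i] / observations.length;
  }
  const hard = Array.from(avg, v => (v >= 0 ? 1 : 0));
  const confidence = avg.reduce((s, v) => s + Math.abs(v), 0) / n;
  return { hard, confidence };
}
```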

### Crop-Resilient Detection

When the frame has been cropped, the detector doesn't know the original tile grid alignment. It searches over three alignment parameters:

1. **DWT padding** (0–3 per axis): the crop may break DWT pixel pairing
2. **DCT block shift** (0–7 per axis): the crop may misalign 8×8 block boundaries within the subband
3. **Tile dither offset** (0–N per axis): the crop shifts which tile-phase position each block maps to

The total search space is 16 × 64 × N² candidates (~37K for the strong preset). To make this fast:

- DCT coefficients are precomputed once per (pad, shift) combination using only tile 0
- Dither offsets are swept cheaply using just DM-QIM re-extraction on cached coefficients
- Candidates are ranked by signal magnitude (sum of squared averaged soft bits)
- Only the top 50 candidates are fully decoded with all frames

This runs in ~1 second for 32 frames on a 512×512 video.
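The ranking step above can be sketched as follows; `Candidate` and its fields are illustrative names, not the detector's actual API.

```typescript
// Score each (pad, shift, offset) candidate by the sum of squared averaged
// soft bits, then keep only the top K for full multi-frame decoding.
interface Candidate {
  pad: number;
  shift: number;
  offset: number;
  softBits: Float64Array; // averaged soft bits for this alignment hypothesis
}

function rankCandidates(cands: Candidate[], topK = 50): Candidate[] {
  const score = (c: Candidate) => c.softBits.reduce((s, v) => s + v * v, 0);
  return [...cands].sort((a, b) => score(b) - score(a)).slice(0, topK);
}
```

A correctly aligned hypothesis yields soft bits far from zero, so its energy dominates misaligned hypotheses whose soft bits average toward zero.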

## Presets

| Preset | Delta | Tile Period | Zigzag Positions | Masking | Use Case |
|--------|-------|-------------|------------------|---------|----------|
| **Light** | 50 | 256px | 3–14 (mid-freq) | No | Near-invisible, mild compression |
| **Moderate** | 62 | 240px | 3–14 (mid-freq) | Yes | Balanced with perceptual masking |
| **Strong** | 110 | 208px | 1–20 (low+mid) | Yes | Heavy re-encoding, rescaling, cropping |
| **Fortress** | 150 | 192px | 1–20 (low+mid) | Yes | Maximum robustness |

All presets use BCH(63,36,5) with CRC-4 and 2-level DWT.

Higher delta = stronger embedding = more visible artifacts but better survival under attacks. The "strong" and "fortress" presets use more DCT coefficients (zigzag positions 1–20 vs 3–14) for additional redundancy.
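The role of delta can be seen in a scalar DM-QIM sketch. This is a minimal sketch: the core additionally applies keyed per-block dither and the variance-adaptive step from `masking.ts`.

```typescript
// Scalar DM-QIM with step delta: each bit selects one of two quantizer
// lattices offset by delta/2 (plus a shared dither). Embedding snaps the
// coefficient to the chosen lattice; extraction picks the nearer lattice.
function qimEmbed(coeff: number, bit: number, delta: number, dither: number): number {
  const d = dither + (bit ? delta / 2 : 0);
  return Math.round((coeff - d) / delta) * delta + d;
}

function qimExtract(coeff: number, delta: number, dither: number): number {
  const r0 = Math.abs(coeff - qimEmbed(coeff, 0, delta, dither));
  const r1 = Math.abs(coeff - qimEmbed(coeff, 1, delta, dither));
  return r1 < r0 ? 1 : 0;
}
```

The two lattices are delta/2 apart, so a bit survives any perturbation smaller than delta/4 in the worst case; that is why larger delta buys robustness at the cost of visibility.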

## Robustness

The web UI includes an automated robustness test battery. Each test applies an attack to the watermarked video and attempts detection:

| Attack | Variants Tested |
|--------|----------------|
| **Re-encode** | CRF 23, 28, 33, 38, 43 |
| **Downscale** | 25%, 50%, 75%, 90% |
| **Brightness** | -0.2, +0.2, +0.4 |
| **Contrast** | 0.5×, 1.5×, 2.0× |
| **Saturation** | 0×, 0.5×, 2.0× |
| **Crop** | 5%, 10%, 15%, 20% (per side) |

## API

### Embedding

```typescript
import { embedWatermark } from './core/embedder';
import { getPreset } from './core/presets';

const config = getPreset('moderate');
const result = embedWatermark(yPlane, width, height, payload, key, config);
// result.yPlane: watermarked Y plane (Uint8Array)
// result.psnr: quality metric (dB)
```

### Detection

```typescript
import { detectWatermarkMultiFrame } from './core/detector';
import { getPreset } from './core/presets';

const result = detectWatermarkMultiFrame(yPlanes, width, height, key, config);
// result.detected: boolean
// result.payload: Uint8Array | null
// result.confidence: 0–1
```

### Crop-Resilient Detection

```typescript
const result = detectWatermarkMultiFrame(
  yPlanes, width, height, key, config,
  { cropResilient: true }
);
```

### Auto-Detection (tries all presets)

```typescript
import { autoDetectMultiFrame } from './core/detector';

const result = autoDetectMultiFrame(yPlanes, width, height, key);
// result.presetUsed: which preset matched
```

## HTTP API

```
POST /api/embed   { videoBase64, key, preset, payload }
POST /api/detect  { videoBase64, key, preset?, frames? }
GET  /api/health  → { status: "ok" }
```
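A client request might be built like this. The request fields mirror the route summary above; the helper name is illustrative and not part of `server/api.ts`.

```typescript
// Build the fetch init for POST /api/embed. Field names follow the route
// table; the response shape is not documented here.
function buildEmbedRequest(videoBase64: string, key: string, preset: string, payload: string) {
  return {
    method: "POST" as const,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ videoBase64, key, preset, payload }),
  };
}

// Usage with any HTTP client, e.g.:
//   fetch("http://localhost:7860/api/embed",
//         buildEmbedRequest(b64, "SECRET", "moderate", "DEADBEEF"))
const req = buildEmbedRequest("AAAA", "SECRET", "moderate", "DEADBEEF");
```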

## Testing

```bash
npm test              # Run all tests
npm run test:watch    # Watch mode
```

25 tests across 6 files covering: DWT round-trip, DCT round-trip, DM-QIM embed/extract, BCH encode/decode with error correction, CRC append/verify, full embed-detect pipeline across presets, false positive rejection (wrong key, unwatermarked frame), crop-resilient detection (arbitrary offset and ~20% crop).

## Browser Encoding

The web UI encodes watermarked video using ffmpeg.wasm (x264 in WebAssembly). To avoid memory pressure, frames are encoded in chunks of 100 and concatenated at the end. Peak memory stays proportional to chunk size rather than scaling with video length.