Update README.md
README.md
CHANGED
@@ -9,4 +9,100 @@ tags:
- watermark
---

# WavMark
> AI-based Audio Watermarking Tool

- ⚡ **Leading Stability:** The watermark resists **10** types of common attacks, such as Gaussian noise, MP3 compression, low-pass filtering, and speed variation, achieving over **29** times the robustness of the traditional method.
- 🙉 **High Imperceptibility:** The watermarked audio has over 38 dB SNR and a PESQ of 4.3, which means the watermark is inaudible to humans. Listen to the examples: [https://wavmark.github.io/](https://wavmark.github.io/).
- 😉 **Easy to Extend:** This project is entirely Python-based. You can easily build on our underlying PyTorch model to implement a custom watermarking system with higher capacity or robustness.
- 🤗 **Hugging Face Spaces:** Try our online demonstration: https://huggingface.co/spaces/M4869/WavMark

## Installation
```
pip install wavmark
```

## Basic Usage
The following code adds a 16-bit watermark to the input file `example.wav` and then decodes it:
```python
import numpy as np
import soundfile
import torch
import wavmark


# 1. Load the model
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model = wavmark.load_model().to(device)

# 2. Create a 16-bit payload
payload = np.random.choice([0, 1], size=16)
print("Payload:", payload)

# 3. Read the host audio
# The audio should be a single-channel 16 kHz WAV file; you can read it using soundfile:
signal, sample_rate = soundfile.read("example.wav")
# Otherwise, you can use the following function to convert the host audio to single-channel 16 kHz format:
# from wavmark.utils import file_reader
# signal = file_reader.read_as_single_channel("example.wav", aim_sr=16000)

# 4. Encode the watermark
watermarked_signal, _ = wavmark.encode_watermark(model, signal, payload, show_progress=True)
# You can save it as a new WAV file:
# soundfile.write("output.wav", watermarked_signal, 16000)

# 5. Decode the watermark and compute the bit error rate (BER) in percent
payload_decoded, _ = wavmark.decode_watermark(model, watermarked_signal, show_progress=True)
BER = (payload != payload_decoded).mean() * 100

print("Decode BER:%.1f" % BER)
```
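
The example above uses a random payload. In practice the 16 bits would usually carry something meaningful, such as a numeric identifier. The sketch below shows one way to convert between an integer and a 16-bit payload; the helper names `int_to_bits` and `bits_to_int` and the identifier value are illustrative assumptions, not part of the `wavmark` package.

```python
def int_to_bits(value, width=16):
    """Return `value` as a list of `width` bits, most significant bit first."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"value must fit in {width} bits")
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]


def bits_to_int(bits):
    """Rebuild the integer from a sequence of 0/1 values (MSB first)."""
    result = 0
    for bit in bits:
        # int() also accepts NumPy integer scalars, so decoded arrays work too
        result = (result << 1) | int(bit)
    return result


user_id = 0xBEEF  # hypothetical 16-bit identifier
payload_bits = int_to_bits(user_id)
print(payload_bits)
print(hex(bits_to_int(payload_bits)))  # 0xbeef
```

The resulting list can be wrapped with `np.array(payload_bits)` and passed as `payload` to `wavmark.encode_watermark`, in place of the random payload used above. Applying `bits_to_int` to the decoded payload should then recover the identifier whenever the BER is zero.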

## Low-level Access

```python
import numpy as np
import soundfile
import torch
import wavmark

# 1. Load the model
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model = wavmark.load_model().to(device)

# 2. Take the first 16,000 samples (one second at 16 kHz) and create a 32-bit message
signal, sample_rate = soundfile.read("example.wav")
chunk = signal[0:16000]
message_npy = np.random.choice([0, 1], size=32)

# 3. Encode
with torch.no_grad():
    signal = torch.FloatTensor(chunk).to(device)[None]
    message_tensor = torch.FloatTensor(message_npy).to(device)[None]
    signal_wmd_tensor = model.encode(signal, message_tensor)
    signal_wmd_npy = signal_wmd_tensor.detach().cpu().numpy().squeeze()

# 4. Decode
with torch.no_grad():
    signal = torch.FloatTensor(signal_wmd_npy).to(device).unsqueeze(0)
    message_decoded_npy = (model.decode(signal) >= 0.5).int().detach().cpu().numpy().squeeze()

# Bit error rate (BER) in percent
BER = (message_npy != message_decoded_npy).mean() * 100
print("BER:", BER)
```
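
With both the 16,000 host samples and `signal_wmd_npy` in hand, the SNR can be estimated by treating the difference between the two signals as noise. The following is a minimal sketch using only the standard library; the function name `snr_db` is an assumption for illustration, and the toy sine-wave data stands in for real audio.

```python
import math


def snr_db(host, watermarked):
    """Signal-to-noise ratio in dB, treating (watermarked - host) as the noise."""
    if len(host) != len(watermarked):
        raise ValueError("signals must have the same length")
    signal_power = sum(float(x) ** 2 for x in host)
    noise_power = sum((float(y) - float(x)) ** 2 for x, y in zip(host, watermarked))
    if noise_power == 0:
        return math.inf  # identical signals: no distortion at all
    if signal_power == 0:
        return -math.inf  # silent host: any change is pure noise
    return 10 * math.log10(signal_power / noise_power)


# Toy data: a 440 Hz tone plus a small 3 kHz perturbation, one second at 16 kHz.
host = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
watermarked = [x + 0.01 * math.cos(2 * math.pi * 3000 * n / 16000)
               for n, x in enumerate(host)]
print("SNR: %.1f dB" % snr_db(host, watermarked))
```

The function iterates over its inputs element by element, so it also accepts one-dimensional NumPy arrays, although a vectorized NumPy version would be faster for long recordings.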

## Thanks
[Audiowmark](https://uplex.de/audiowmark), developed by Stefan Westerfeld, provided valuable ideas for the design of this project.

## Citation
```
@misc{chen2023wavmark,
      title={WavMark: Watermarking for Audio Generation},
      author={Guangyu Chen and Yu Wu and Shujie Liu and Tao Liu and Xiaoyong Du and Furu Wei},
      year={2023},
      eprint={2308.12770},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}
```