---
license: cc-by-nc-4.0
tags:
- audio-classification
- ai-music-detection
- forensic
- onnx
language:
- en
pipeline_tag: audio-classification
---
# ArtifactNet v9.4: AI-Generated Music Forensic Detection
ArtifactNet detects AI-generated music by extracting forensic residual artifacts via a task-specific UNet, rather than learning generator-specific patterns. This approach generalizes across 22 AI music generators with only 4.2M parameters.
> ⚠️ **License: CC BY-NC 4.0 (Non-Commercial Only)**
> This ONNX inference build may not be used for any commercial product, service, API, or
> revenue-generating activity. Research, academic, and personal evaluation use are welcome.
> For commercial licensing, contact: **contact@intrect.io**
> 🛡️ **Patent Pending (KR + PCT)**
> The bounded-mask residual extraction and codec-invariant training methods used in
> ArtifactNet are covered by pending patent applications. Use of the ONNX build under
> CC BY-NC 4.0 grants no patent license; commercial deployment requires both a
> commercial license and a patent license (contact above for both).
> ℹ️ **What is released**
> A pre-compiled, end-to-end **ONNX inference build** of the full pipeline (STFT → UNet →
> HPSS → 7-channel CNN → sigmoid). Raw PyTorch weights, training code, and training data
> are **not** publicly released. This is a deliberate scope limitation: the released
> binary is sufficient to reproduce inference numbers reported in our paper, but does
> not enable fine-tuning or weight extraction.
## Model Description
- **Architecture**: ArtifactUNet (3.6M) + 7ch HPSS Forensic CNN (424K) = 4.2M total
- **Input**: 44.1kHz mono audio, 4-second segments
- **Output**: P(AI) ∈ [0, 1] per segment, song-level median verdict
- **Format**: Single ONNX file (entire pipeline: STFT → UNet → HPSS → 7ch → CNN → sigmoid)
## Performance: ArtifactBench v0.9 (test-only fair eval, all models unseen)
| Metric | ArtifactNet (4.2M) | CLAM (194M) | SpecTTTra (19M) |
|---|---|---|---|
| **F1** | **0.9829** | 0.7576 | 0.7713 |
| **Precision** | 0.9905 | 0.6674 | 0.8519 |
| **Recall (TPR)** | 0.9755 | 0.8761 | 0.7046 |
| **FPR** | 0.0149 | 0.6926 | 0.1943 |
| **AUC** | **0.9974** | 0.7031 | 0.8460 |
| TPR @ FPR ≤ 5% | **99.1%** | - | - |
Evaluated on 2,263 tracks (`bench_origin=test`, unseen by all three models),
threshold τ=0.5, identical preprocessing.
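
As a quick consistency check on the table above, F1 is the harmonic mean of precision and recall, so the F1 row can be recomputed from the Precision and Recall rows. The sketch below does this with the reported figures; the helper name `f1_from_precision_recall` and the dictionary layout are illustrative only and are not part of the released build.

```python
def f1_from_precision_recall(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above
reported = {
    "ArtifactNet": (0.9905, 0.9755, 0.9829),
    "CLAM": (0.6674, 0.8761, 0.7576),
    "SpecTTTra": (0.8519, 0.7046, 0.7713),
}

for name, (precision, recall, f1_reported) in reported.items():
    f1 = f1_from_precision_recall(precision, recall)
    print(f"{name}: recomputed F1 = {f1:.4f}, reported F1 = {f1_reported:.4f}")
```

Because the table values are rounded to four decimals, small differences in the last digit would not by themselves indicate an error.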
## Usage
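```python
import numpy as np
import onnxruntime as ort
import soundfile as sf

SAMPLE_RATE = 44100        # model expects 44.1 kHz mono input
SEGMENT = 4 * SAMPLE_RATE  # one 4-second segment, in samples

# Load ONNX inference build
sess = ort.InferenceSession("artifactnet_v94_full.onnx")

# Load audio and downmix to mono if needed
audio, sr = sf.read("track.wav", dtype="float32")
if audio.ndim > 1:
    audio = audio.mean(axis=1)
if sr != SAMPLE_RATE:
    raise ValueError(f"Expected {SAMPLE_RATE} Hz audio, got {sr} Hz; resample first")

# Take the first 4-second chunk as a (1, samples) float32 batch
chunk = audio[:SEGMENT].reshape(1, -1).astype(np.float32)

# Inference (np.ravel makes the indexing robust to the output's exact shape)
prob = float(np.ravel(sess.run(None, {"audio": chunk})[0])[0])
print(f"P(AI) = {prob:.4f}")  # > 0.5 → AI-generated
```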
For a song-level verdict, compute the median of P(AI) over multiple 4-second chunks.
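
The median aggregation can be sketched as follows. This is a minimal illustration and not the authors' implementation: the helper name `song_level_score`, the `predict_segment` callback, the use of non-overlapping segments, and dropping a trailing partial segment are all assumptions. Only the 44.1 kHz rate, the 4-second segment length, and the median come from this card.

```python
from statistics import median

SAMPLE_RATE = 44_100               # from the model card: 44.1 kHz mono input
SEGMENT_SAMPLES = 4 * SAMPLE_RATE  # from the model card: 4-second segments


def song_level_score(audio, predict_segment, segment_samples=SEGMENT_SAMPLES):
    """Median of per-segment P(AI) over non-overlapping segments.

    `audio` is any 1-D sliceable sequence of mono samples (e.g. a NumPy array).
    `predict_segment` is a hypothetical callback mapping one segment to a float
    in [0, 1]. Non-overlapping windows and dropping the trailing partial
    segment are assumptions of this sketch.
    """
    n_segments = len(audio) // segment_samples
    if n_segments == 0:
        raise ValueError("audio is shorter than one full segment")
    scores = [
        float(predict_segment(audio[i * segment_samples:(i + 1) * segment_samples]))
        for i in range(n_segments)
    ]
    return median(scores)
```

With the session from the snippet above, `predict_segment` could wrap `sess.run`, for example `lambda seg: float(np.ravel(sess.run(None, {"audio": seg.reshape(1, -1).astype(np.float32)})[0])[0])`. Flagging a track when the returned median exceeds 0.5 is one plausible reading of the song-level verdict; applying that threshold to the median is also an assumption here.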
## Benchmark
Evaluate with [ArtifactBench v1](https://huggingface.co/datasets/intrect/artifactbench).
## Citation
```bibtex
@article{oh2026artifactnet,
title = {ArtifactNet: Detecting AI-Generated Music via Forensic Residual Physics},
author = {Oh, Heewon},
journal = {arXiv preprint arXiv:2604.16254},
year = {2026},
eprint = {2604.16254},
archivePrefix= {arXiv},
primaryClass = {cs.SD},
doi = {10.48550/arXiv.2604.16254},
url = {https://arxiv.org/abs/2604.16254}
}
```
**arXiv**: [2604.16254](https://arxiv.org/abs/2604.16254) · **DOI**: [10.48550/arXiv.2604.16254](https://doi.org/10.48550/arXiv.2604.16254)
## License
**CC BY-NC 4.0**: Free for academic, research, and personal use. **Commercial use is
prohibited** without prior written permission. This includes (but is not limited to):
- Selling access to the ONNX build or its outputs
- Integrating into commercial products, SaaS, or APIs
- Using the model to generate revenue, directly or indirectly
- Attempting to extract weights for derivative commercial models
For commercial licensing inquiries: **contact@intrect.io**
### Patent Notice
Patent applications covering the core methods of ArtifactNet are pending in Korea (KR)
and via the Patent Cooperation Treaty (PCT). The CC BY-NC 4.0 license on this ONNX
inference build does **not** convey any patent rights. Commercial use, even under a
commercial copyright license, requires a separate patent license. Academic and
research use within the scope of CC BY-NC 4.0 is permitted without a separate patent
license, consistent with standard research-use exemptions.