Open-Audiodeleto

Public/open-source lightweight baseline.

This model detects audio patterns similar to known bypass uploads. It does not prove copyright infringement. It does not prove intent. It should not be used as the sole basis for enforcement.

What It Does

This model produces a bypass-audio probability from audio-only features such as duration, RMS statistics, spectral statistics, tempo, silence ratio, and loudness-transition scores.
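To make two of these feature families concrete, here is a minimal sketch of frame-level RMS statistics and a silence ratio, computed from mono samples in the range [-1, 1]. It is an illustration only and is not the extraction code shipped with the model. The function names, the output keys, the 1024-sample frame size, and the 0.01 silence threshold are all assumptions chosen for the example.

```python
import math
import statistics


def frame_rms(samples, frame_size=1024):
    """Return the RMS level of each consecutive, non-overlapping frame."""
    levels = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        levels.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    return levels


def rms_features(samples, frame_size=1024, silence_threshold=0.01):
    """Summarise frame RMS levels.

    Key names, frame size, and threshold are hypothetical, not the model's own.
    """
    levels = frame_rms(samples, frame_size)
    if not levels:
        raise ValueError("no audio samples")
    silent = sum(1 for level in levels if level < silence_threshold)
    return {
        "rms_mean": statistics.fmean(levels),
        "rms_std": statistics.pstdev(levels),
        "silence_ratio": silent / len(levels),
    }


# One silent frame followed by one frame at constant amplitude 0.5.
example = rms_features([0.0] * 1024 + [0.5] * 1024)
```

The real feature set also covers spectral statistics, tempo, and loudness-transition scores, which need a proper audio library and are left out of this sketch.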

What It Does Not Do

It does not prove copyright infringement or intent. Do not use it for automatic bans, automatic account termination, or legal copyright conclusions; no automatic enforcement should rely on this model alone.

Run

python open_audiodeleto.py /path/to/audio.ogg

Example:

python open_audiodeleto.py audio.mp3

Example JSON output:

{
  "model": "Open-Audiodeleto",
  "version": "0.1-800",
  "file": "audio.mp3",
  "bypassProbability": 0.87,
  "riskLevel": "high",
  "recommendation": "review",
  "note": "This model detects audio patterns similar to known bypass uploads. It does not prove copyright infringement. It does not prove intent. It should not be used as the sole basis for enforcement."
}
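A downstream consumer might treat this output as a routing signal for human review rather than as a decision. The sketch below assumes the scorer's JSON is available as a string; the `triage` function, the queue names, and the 0.5 threshold are hypothetical and are not taken from metadata.json.

```python
import json


def triage(raw_output, review_threshold=0.5):
    """Map scorer output to a queue name, never to an enforcement action."""
    result = json.loads(raw_output)
    probability = float(result["bypassProbability"])
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability out of range: {probability}")
    # review_threshold is a hypothetical value chosen for this example.
    return "human_review" if probability >= review_threshold else "no_action"


sample = '{"file": "audio.mp3", "bypassProbability": 0.87, "riskLevel": "high"}'
decision = triage(sample)
```

Keeping the only non-trivial outcome as a review queue reflects the stated limit that the score should not be the sole basis for enforcement.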

Metrics

Validation:

{
  "accuracy": 0.7,
  "precision": 0.6875,
  "recall": 0.7333333333333333,
  "f1": 0.7096774193548387,
  "confusion_matrix": [[40, 20], [16, 44]]
}

Test:

{
  "accuracy": 0.7916666666666666,
  "precision": 0.7692307692307693,
  "recall": 0.8333333333333334,
  "f1": 0.8,
  "confusion_matrix": [[45, 15], [10, 50]]
}
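The reported scores can be re-derived from the confusion matrices. The sketch below assumes the scikit-learn layout (rows are true labels, columns are predicted labels, class 1 is the bypass class). The source does not state this layout, but it is the one that reproduces the reported precision and recall. The function name is hypothetical.

```python
def metrics_from_confusion(matrix):
    """Compute binary metrics from [[TN, FP], [FN, TP]] (assumed layout)."""
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


validation = metrics_from_confusion([[40, 20], [16, 44]])
test_split = metrics_from_confusion([[45, 15], [10, 50]])
```

Each split holds 120 files, 60 per class, so on the test split the 15 false positives mean a quarter of non-bypass files were flagged.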

Files

model.pkl
metadata.json
features.csv
splits.csv
results.json
misclassified_files.csv
invalid_files.csv
open_audiodeleto.py
requirements.txt

Purpose

Public lightweight baseline audio-bypass risk classifier.

Audiodeleto is an audio-bypass risk signal only. It does not prove copyright infringement, intent, or a policy violation, and should not be the sole basis for enforcement.

Model Files

bin/audiodeleto.py: command-line scorer.
model/model.pkl: Python/scikit-learn model artifact.
model/model.onnx: cross-platform ONNX model artifact.
model/metadata.json: thresholds, version, feature configuration, calibration metadata.
model/feature_spec.json: canonical feature names and order.
reports/: validation, ONNX export, and misclassification reports.
training/: feature rows and split files used to package this release.

Runtime Artifacts

Python usage (`model.pkl`)

Use model/model.pkl with the exact feature order in model/feature_spec.json when running inside Python:

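import json
import joblib
import pandas as pd

# Load the scikit-learn artifact and the canonical feature order.
model = joblib.load("model/model.pkl")
with open("model/feature_spec.json", encoding="utf-8") as f:
    feature_spec = json.load(f)

# `values` is one row of precomputed features, supplied by the caller
# in the same order as feature_spec["featureNames"].
features = pd.DataFrame([values], columns=feature_spec["featureNames"])
# Column 1 of predict_proba holds the bypass-class probability.
probability = float(model.predict_proba(features)[0, 1])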

Cross-platform usage (`model.onnx`)

Use model/model.onnx from any ONNX Runtime host. The ONNX graph expects precomputed audio features, not raw audio.

Model ID: open-audiodeleto
Version: 0.1-800
Input name: features
Input type: float32
Input shape: [batch, 15]
Feature order: feature_spec.json
Probability output: probabilities[:, 1]
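A minimal sketch of scoring with ONNX Runtime from Python, using the tensor names listed above. It assumes `onnxruntime` and `numpy` are installed and that the `probabilities` output is a float tensor of shape [batch, 2], as `probabilities[:, 1]` suggests. The helper names `check_batch` and `score_onnx` are hypothetical.

```python
EXPECTED_FEATURES = 15  # from the documented input shape [batch, 15]


def check_batch(rows):
    """Validate row width and coerce every value to float."""
    checked = []
    for index, row in enumerate(rows):
        if len(row) != EXPECTED_FEATURES:
            raise ValueError(
                f"row {index} has {len(row)} values, expected {EXPECTED_FEATURES}"
            )
        checked.append([float(value) for value in row])
    return checked


def score_onnx(rows, model_path="model/model.onnx"):
    """Return the bypass probability for each precomputed feature row."""
    batch = check_batch(rows)
    # Imported lazily so the shape check above works without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    features = np.asarray(batch, dtype=np.float32)
    # Request only the "probabilities" output; the input tensor is "features".
    (probabilities,) = session.run(["probabilities"], {"features": features})
    return [float(p) for p in probabilities[:, 1]]
```

Rows passed to `score_onnx` must already follow the order in model/feature_spec.json; the graph has no way to detect a permuted row.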

Feature spec warning

Do not infer feature order from CSV column order or object key order. Always use model/feature_spec.json; a wrong feature order can produce valid-looking but incorrect probabilities.
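One way to guard against this is to build each input row by name from the spec, failing loudly on any mismatch. The sketch below assumes the spec exposes its names under `featureNames`, as the Python usage snippet does. The `order_features` helper and the three feature names in the example are hypothetical stand-ins for the real 15.

```python
def order_features(row, feature_names):
    """Return row values in feature_names order, rejecting missing or extra keys."""
    missing = [name for name in feature_names if name not in row]
    extra = [key for key in row if key not in feature_names]
    if missing or extra:
        raise KeyError(f"missing={missing} extra={extra}")
    return [float(row[name]) for name in feature_names]


# Hypothetical three-feature spec for illustration only; real names come from
# feature_spec["featureNames"] in model/feature_spec.json.
names = ["duration", "rms_mean", "silence_ratio"]
row = {"silence_ratio": 0.5, "duration": 12.0, "rms_mean": 0.25}
ordered = order_features(row, names)
```

Looking values up by name means the dictionary's own key order never reaches the model.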

Evaluation results

accuracy (self-reported): 0.700
precision (self-reported): 0.688
recall (self-reported): 0.733
f1 (self-reported): 0.710