CED (GGUF) for ced.cpp / LocalAI

GGUF quantizations of the CED family (Consistent Ensemble Distillation, Xiaomi) - SOTA-tier audio-tagging models that classify everyday sounds (baby cry, footsteps, glass breaking, alarms, dog bark, ...) into the 527-class AudioSet ontology.

These files run with ced.cpp, a standalone C++/ggml port (no Python, no PyTorch at inference), and with LocalAI via the ced backend. They are converted from the mispeech/ced-* checkpoints (Apache-2.0). CED is a plain AST/DeiT-style Vision Transformer over a log-mel spectrogram; at f32 the port matches the PyTorch reference to within 1.7e-7 on the output probabilities (see the parity table below).

Files

One self-contained GGUF per size + quant (config, 527 labels, and the mel filterbank/window are all embedded). Pick by your accuracy/size budget:

size	params	f16	q8_0	f32
tiny	5.5M	`ced-tiny-f16.gguf` (11 MB)	`ced-tiny-q8_0.gguf` (6 MB)	-
mini	9.6M	`ced-mini-f16.gguf` (19 MB)	`ced-mini-q8_0.gguf` (11 MB)	-
small	22M	`ced-small-f16.gguf` (42 MB)	`ced-small-q8_0.gguf` (23 MB)	-
base	86M	`ced-base-f16.gguf` (165 MB)	`ced-base-q8_0.gguf` (88 MB)	`ced-base-f32.gguf` (328 MB)

tiny/q8_0 (6 MB) is ideal for Raspberry-Pi-class CPUs; base/f16 is the accuracy default.

Parity vs PyTorch (ced-base, end-to-end probs)

quant	max abs diff	top-5 tags
f32	1.7e-7	identical
f16	6.4e-5	identical
q8_0	6.0e-3	identical
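
The two columns above can be reproduced with a check along these lines. This is a minimal sketch, not the script used to produce the table: the helper names `max_abs_diff` and `top_k` are hypothetical, the vectors are made-up toy values rather than real model outputs, and "identical" is assumed to mean the same five class indices in the same order.

```python
def max_abs_diff(ref, test):
    """Largest element-wise absolute difference between two probability vectors."""
    if len(ref) != len(test):
        raise ValueError("vectors must have the same length")
    return max(abs(r - t) for r, t in zip(ref, test))


def top_k(probs, k=5):
    """Indices of the k highest probabilities, best first."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]


# Toy 8-class vectors standing in for the 527 AudioSet probabilities
# (made-up numbers, not measurements from any CED model).
ref_probs = [0.87, 0.12, 0.03, 0.01, 0.40, 0.002, 0.25, 0.07]
test_probs = [0.868, 0.121, 0.031, 0.009, 0.402, 0.002, 0.249, 0.071]

diff = max_abs_diff(ref_probs, test_probs)
same_top5 = top_k(ref_probs) == top_k(test_probs)
print(f"max abs diff: {diff:.1e}, top-5 identical: {same_top5}")
```

In a real comparison, `ref_probs` would come from the PyTorch model and `test_probs` from the GGUF port, both run on the same clip.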

Performance (CPU, ced-base, 10s clip, Ryzen 9 9950X3D, 4 threads)

runtime	latency	realtime factor	peak RSS
PyTorch (transformers, f32)	155.7 ms	65x	717 MB
ced.cpp f16	100.6 ms	100x	189 MB
ced.cpp q8_0	117.2 ms	86x	111 MB

ced.cpp f16 is ~1.55x faster than the PyTorch reference; q8_0 uses ~6.5x less memory.
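
The ratios quoted above follow directly from the table. The sketch below recomputes them from the published figures; the dictionary layout and the `ratio` helper are illustrative only.

```python
# Figures copied from the performance table (ced-base, 10 s clip).
runs = {
    "pytorch_f32": {"latency_ms": 155.7, "peak_rss_mb": 717},
    "ced_f16": {"latency_ms": 100.6, "peak_rss_mb": 189},
    "ced_q8_0": {"latency_ms": 117.2, "peak_rss_mb": 111},
}


def ratio(baseline, other, key):
    """How many times larger the baseline value is than the other run's value."""
    return runs[baseline][key] / runs[other][key]


speedup_f16 = ratio("pytorch_f32", "ced_f16", "latency_ms")
mem_q8 = ratio("pytorch_f32", "ced_q8_0", "peak_rss_mb")

print(f"f16 speedup over PyTorch: {speedup_f16:.2f}x")
print(f"q8_0 memory reduction vs PyTorch: {mem_q8:.1f}x")
```

The same helper shows the trade-off between the two quants: in this benchmark f16 has the lower latency, while q8_0 has the smaller memory footprint.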

Usage

ced-cli classify ced-base-f16.gguf clip.wav --top-k 5
# 0.87  Baby cry, infant cry
# 0.12  Crying, sobbing
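
To consume these results from a script, the output can be parsed into (score, label) pairs. This is a rough sketch that assumes each result line is a score, whitespace, then the label, as in the example above; the leading `# ` is treated as optional, and `parse_tags` and `classify` are hypothetical helper names, not part of ced.cpp. Check the actual output format of your ced-cli build before relying on it.

```python
import re
import subprocess

# Assumed line shape: optional "# ", a decimal score, whitespace, then the label.
LINE_RE = re.compile(r"^\s*(?:#\s*)?(\d*\.\d+|\d+)\s+(.+?)\s*$")


def parse_tags(text):
    """Turn CLI output into a list of (score, label) tuples, skipping other lines."""
    tags = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            tags.append((float(m.group(1)), m.group(2)))
    return tags


def classify(model_path, wav_path, top_k=5):
    """Run the documented ced-cli command and parse its stdout (requires ced-cli on PATH)."""
    out = subprocess.run(
        ["ced-cli", "classify", model_path, wav_path, "--top-k", str(top_k)],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tags(out)


# Parsing the sample output shown above, without invoking the binary.
sample = "# 0.87  Baby cry, infant cry\n# 0.12  Crying, sobbing\n"
tags = parse_tags(sample)
print(tags)
```

Labels such as "Baby cry, infant cry" contain commas, which is why the parser splits on the first run of whitespace after the score rather than on commas.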

In LocalAI: install the ced backend, configure a model with one of these GGUFs, then call POST /v1/audio/classification (or stream over the realtime websocket API for live recognition).

License

Model weights: Apache-2.0 (© Xiaomi Corporation; from the mispeech/ced-* checkpoints). AudioSet labels are CC-BY-4.0. The ced.cpp inference code is MIT.
