SYSTEM BENCHMARKS

Performance Metrics

TESTBED: RTX 4060 Ti • 16GB RAM • UBUNTU 22.04

PROCESSING THROUGHPUT
60× Real-time

Processes 1 hour of audio in ~1 minute. Massive parallelism using batched GPU inference.

OUR SYSTEM (1m) REAL-TIME (60m)
STREAMING LATENCY
200ms

Chunk processing delay
Feels instantaneous.

VRAM USAGE
3.5GB

Runs on consumer GPUs
(FP16 Quantized).

COST EFFICIENCY
$0.02
CLOUD / MIN
$0.00
OURS / MIN
OPTIMIZED FOR EDGE
CUDA / ONNX
Dockerized