# Perch v2 PyTorch: TF SavedModel Source (wrice/perch-v2-pytorch-tflite)
Google's Perch v2 (EfficientNet-B3 backbone), ported to PyTorch with weights extracted directly from the TF SavedModel. No ONNX intermediary. Drop-in replacement for wrice/perch-v2-pytorch.
## Source model
- TF SavedModel:
- TFLite variant: checked on Kaggle; not available for this model.
## Precision achieved
Tested on 5 BirdCLEF 2026 train soundscape files (12 clips × 160,000 samples each):
| Metric | Value |
|---|---|
| max_abs_diff vs TF SavedModel | ~8.88e-6 |
| atol=1e-5 pass | ✓ all 5 files |
| atol=1e-6 pass | ✗ (structural floor) |
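The per-file check in the table can be reproduced with a comparison along these lines (a sketch; `tolerance_report` and the toy arrays are illustrative stand-ins, not this repo's test harness):

```python
import numpy as np

def tolerance_report(pt_out: np.ndarray, tf_out: np.ndarray) -> dict:
    """Compare two model outputs the way the table above does:
    max absolute difference plus pass/fail at fixed absolute tolerances."""
    diff = np.max(np.abs(pt_out.astype(np.float64) - tf_out.astype(np.float64)))
    return {
        "max_abs_diff": float(diff),
        "atol=1e-5": bool(diff <= 1e-5),
        "atol=1e-6": bool(diff <= 1e-6),
    }

# Toy stand-in arrays with a known worst-case difference of ~8.88e-6,
# mirroring the measured max_abs_diff in the table.
a = np.zeros(4, dtype=np.float32)
b = a.copy()
b[0] += np.float32(8.88e-6)
report = tolerance_report(a, b)
```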
## Why atol=1e-6 is not achievable
Two irreducible float32 rounding differences confirmed by float64 control tests:
- **Mel spectrogram**: TF's XLA FFT kernel uses a different float32 accumulation order than PyTorch's FFT. Float64 control: the diff drops from 9e-6 to 6e-6 (0.67×, NOT orders of magnitude). Structural: TF XLA vs FFTW/KissFFT.
- **Conv2d + BatchNorm2d chain (backbone)**: TF XLA fuses conv + BN into a single FMA kernel; PyTorch keeps them as separate ops. Float64 control: the embedding diff stays at ~4.5e-7 (0.9× from 5e-7, essentially unchanged). Structural: different FP32 evaluation order.
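The accumulation-order effect behind both items can be demonstrated in a few lines of float32 arithmetic (a generic illustration, not TF's or PyTorch's actual kernels):

```python
import numpy as np

# The same three float32 terms, summed in two different orders.
big, one = np.float32(1e8), np.float32(1.0)

# Order A: (1e8 + 1) - 1e8. The float32 spacing at 1e8 is 8, so
# 1e8 + 1 rounds back down to 1e8 and the final result is 0.0.
order_a = (big + one) - big

# Order B: (1e8 - 1e8) + 1. The cancellation happens first, so the
# result is exactly 1.0.
order_b = (big - big) + one

# Mathematically identical sums, different float32 results: kernels that
# accumulate in a different order cannot match bit-for-bit.
```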
## Relationship to wrice/perch-v2-pytorch weights
ONNX export of a TF SavedModel preserves float32 values bit-for-bit, so the two repos carry numerically identical weights (max weight diff = 0.00e+00). The distinction is provenance: this repo's weights were extracted directly from the TF SavedModel, with no ONNX intermediary.
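The max weight diff figure comes from comparing the two checkpoints tensor by tensor; a minimal sketch (the checkpoint loading itself is elided, and the toy state dicts below are stand-ins):

```python
import numpy as np

def max_state_dict_diff(sd_a: dict, sd_b: dict) -> float:
    """Max absolute elementwise difference across all shared tensors.
    Raises if the two checkpoints have different parameter names."""
    if sd_a.keys() != sd_b.keys():
        raise ValueError("checkpoints have different parameter names")
    worst = 0.0
    for name in sd_a:
        d = np.max(np.abs(np.asarray(sd_a[name], dtype=np.float64)
                          - np.asarray(sd_b[name], dtype=np.float64)))
        worst = max(worst, float(d))
    return worst

# Toy example: two identical "checkpoints" give exactly 0.0, which is
# the result reported above for the two repos.
sd = {"conv.weight": np.ones((3, 3), dtype=np.float32),
      "bn.bias": np.zeros(3, dtype=np.float32)}
diff = max_state_dict_diff(sd, {k: v.copy() for k, v in sd.items()})
```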
## Usage
## Files
- : 43.2 MB, PyTorch weights
- : ~261 KB, mel window + filterbank constants
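A mel window and filterbank of this kind are typically consumed per STFT frame as below (a generic sketch of a mel front end; the frame size, mel count, and random filterbank here are assumptions for illustration, not values from the constants file):

```python
import numpy as np

def mel_frame(frame: np.ndarray, window: np.ndarray,
              filterbank: np.ndarray) -> np.ndarray:
    """One audio frame -> mel energies: window, FFT magnitude,
    then project through the mel filterbank."""
    spectrum = np.abs(np.fft.rfft(frame * window))
    return spectrum @ filterbank  # (n_fft//2+1,) @ (n_fft//2+1, n_mels)

# Toy shapes only; the real window/filterbank come from the constants file.
n_fft, n_mels = 512, 8
window = np.hanning(n_fft).astype(np.float32)
filterbank = np.random.default_rng(0).random((n_fft // 2 + 1, n_mels)).astype(np.float32)
mels = mel_frame(np.ones(n_fft, dtype=np.float32), window, filterbank)
```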