# Perch v2 PyTorch: TF SavedModel Source (wrice/perch-v2-pytorch-tflite)
Google's Perch v2 (EfficientNet-B3 backbone), ported to PyTorch with weights extracted directly from the TF SavedModel. No ONNX intermediary. Drop-in replacement for wrice/perch-v2-pytorch.
## Source model
- TF SavedModel:
- TFLite variant: checked on Kaggle; not available for this model.
## Precision achieved
Tested on 5 BirdCLEF 2026 train soundscape files (12 clips × 160,000 samples each):
| Metric | Value |
|---|---|
| max_abs_diff vs TF SavedModel | ~8.88e-6 |
| atol=1e-5 pass | ✓ all 5 files |
| atol=1e-6 pass | ✗ (structural floor) |
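The per-file check in the table can be reproduced with a comparison along these lines (a sketch; `tolerance_report` and the toy arrays are illustrative stand-ins, not this repo's test harness):

```python
import numpy as np

def tolerance_report(pt_out: np.ndarray, tf_out: np.ndarray) -> dict:
    """Compare two model outputs the way the table above does:
    max absolute difference plus pass/fail at fixed absolute tolerances."""
    diff = np.max(np.abs(pt_out.astype(np.float64) - tf_out.astype(np.float64)))
    return {
        "max_abs_diff": float(diff),
        "atol=1e-5": bool(diff <= 1e-5),
        "atol=1e-6": bool(diff <= 1e-6),
    }

# Toy stand-in arrays with a known worst-case difference of ~8.88e-6,
# mirroring the measured max_abs_diff in the table.
a = np.zeros(4, dtype=np.float32)
b = a.copy()
b[0] += np.float32(8.88e-6)
report = tolerance_report(a, b)
```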
## Why atol=1e-6 is not achievable
Two irreducible float32 rounding differences confirmed by float64 control tests:
- **Mel spectrogram**: TF's XLA FFT kernel uses a different float32 accumulation order than PyTorch's FFT. Float64 control: the diff drops from 9e-6 to 6e-6 (0.67×, NOT orders of magnitude). Structural: TF XLA vs FFTW/KissFFT.
- **Conv2d + BatchNorm2d chain (backbone)**: TF XLA fuses conv + BN into a single FMA kernel; PyTorch keeps them as separate ops. Float64 control: the embedding diff stays at ~4.5e-7 (0.9× from 5e-7, essentially unchanged). Structural: different FP32 evaluation order.
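The accumulation-order effect behind both items can be demonstrated in a few lines of float32 arithmetic (a generic illustration, not TF's or PyTorch's actual kernels):

```python
import numpy as np

# The same three float32 terms, summed in two different orders.
big, one = np.float32(1e8), np.float32(1.0)

# Order A: (1e8 + 1) - 1e8. The float32 spacing at 1e8 is 8, so
# 1e8 + 1 rounds back down to 1e8 and the final result is 0.0.
order_a = (big + one) - big

# Order B: (1e8 - 1e8) + 1. The cancellation happens first, so the
# result is exactly 1.0.
order_b = (big - big) + one

# Mathematically identical sums, different float32 results: kernels that
# accumulate in a different order cannot match bit-for-bit.
```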
## Relationship to wrice/perch-v2-pytorch weights
ONNX export of a TF SavedModel preserves float32 values bit-for-bit, so the two repos carry numerically identical weights (max weight diff = 0.00e+00). The distinction is provenance: this repo's weights were extracted directly from the TF SavedModel, with no ONNX intermediary.
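The max weight diff figure comes from comparing the two checkpoints tensor by tensor; a minimal sketch (the checkpoint loading itself is elided, and the toy state dicts below are stand-ins):

```python
import numpy as np

def max_state_dict_diff(sd_a: dict, sd_b: dict) -> float:
    """Max absolute elementwise difference across all shared tensors.
    Raises if the two checkpoints have different parameter names."""
    if sd_a.keys() != sd_b.keys():
        raise ValueError("checkpoints have different parameter names")
    worst = 0.0
    for name in sd_a:
        d = np.max(np.abs(np.asarray(sd_a[name], dtype=np.float64)
                          - np.asarray(sd_b[name], dtype=np.float64)))
        worst = max(worst, float(d))
    return worst

# Toy example: two identical "checkpoints" give exactly 0.0, which is
# the result reported above for the two repos.
sd = {"conv.weight": np.ones((3, 3), dtype=np.float32),
      "bn.bias": np.zeros(3, dtype=np.float32)}
diff = max_state_dict_diff(sd, {k: v.copy() for k, v in sd.items()})
```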
## Usage
## Files
- : 43.2 MB, PyTorch weights
- : ~261 KB, mel window + filterbank constants
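A mel window and filterbank of this kind are typically consumed per STFT frame as below (a generic sketch of a mel front end; the frame size, mel count, and random filterbank here are assumptions for illustration, not values from the constants file):

```python
import numpy as np

def mel_frame(frame: np.ndarray, window: np.ndarray,
              filterbank: np.ndarray) -> np.ndarray:
    """One audio frame -> mel energies: window, FFT magnitude,
    then project through the mel filterbank."""
    spectrum = np.abs(np.fft.rfft(frame * window))
    return spectrum @ filterbank  # (n_fft//2+1,) @ (n_fft//2+1, n_mels)

# Toy shapes only; the real window/filterbank come from the constants file.
n_fft, n_mels = 512, 8
window = np.hanning(n_fft).astype(np.float32)
filterbank = np.random.default_rng(0).random((n_fft // 2 + 1, n_mels)).astype(np.float32)
mels = mel_frame(np.ones(n_fft, dtype=np.float32), window, filterbank)
```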