FormantNet: ONNX

ONNX exports of the FormantNet neural formant tracker (PaPE 2021 / IS 2021), with fp32, fp16, and int8 dynamic-quantization variants.

The exported model is the LSTM1_noIAIF_DFLoss configuration trained on TIMIT: experiment mvt33_f6z1sTpF10 (6 formants + 1 antiformant, delta-frequency loss weight 0.15).

Architecture

Input  (batch, time, 257)   - normalized log-spectral envelope, 32 ms window at 16 kHz
LSTM   512 units, unidirectional, return_sequences=True
Dense  20 units, sigmoid
Output (batch, time, 20)    - raw sigmoid values in [0, 1]

Total parameters: 1,587,220 (about 6.05 MiB of fp32 weights).
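The parameter count can be reproduced from the layer sizes listed above. This is a minimal sketch that assumes the standard Keras LSTM layout (four gates, each with an input kernel, a recurrent kernel, and a single bias vector); the helper names are illustrative only.

```python
def lstm_params(input_dim: int, units: int) -> int:
    # Four gates, each with an input kernel, a recurrent kernel, and a bias.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim: int, units: int) -> int:
    # Weight matrix plus bias vector.
    return input_dim * units + units


total = lstm_params(257, 512) + dense_params(512, 20)
print(total)                         # 1587220
print(round(total * 4 / 2**20, 2))   # fp32 weight size in MiB: 6.05
```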

Output layout (20 channels): F1…F6, Fz1 (frequencies); B1…B6, Bz1 (bandwidths); A1…A6 (amplitudes). All are raw sigmoid values in [0, 1]; rescale them with the repo's get_rescale_fn() to obtain Hz / dB.
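The layout above can be unpacked with plain slicing. The sketch below assumes the 20 channels are stored in exactly the order listed (7 frequencies, 7 bandwidths, 6 amplitudes); `split_raw_params` is a hypothetical helper, not part of the repository, and the values it returns are still in [0, 1] until they are rescaled with get_rescale_fn().

```python
import numpy as np

N_FREQ = 7  # F1..F6 + Fz1
N_BW = 7    # B1..B6 + Bz1
N_AMP = 6   # A1..A6


def split_raw_params(raw: np.ndarray):
    """Split (batch, time, 20) raw output into frequency, bandwidth, amplitude blocks."""
    if raw.shape[-1] != N_FREQ + N_BW + N_AMP:
        raise ValueError(f"expected 20 channels, got {raw.shape[-1]}")
    freqs = raw[..., :N_FREQ]
    bws = raw[..., N_FREQ:N_FREQ + N_BW]
    amps = raw[..., N_FREQ + N_BW:]
    return freqs, bws, amps


raw = np.random.rand(1, 200, 20).astype(np.float32)  # stand-in for model output
freqs, bws, amps = split_raw_params(raw)
print(freqs.shape, bws.shape, amps.shape)  # (1, 200, 7) (1, 200, 7) (1, 200, 6)
```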

Files

File                  Precision       Size     Max abs. diff vs fp32
formantnet.onnx       fp32            6.36 MB  0 (reference)
formantnet_fp16.onnx  fp16            3.18 MB  4.1 × 10⁻⁴
formantnet_int8.onnx  int8 (dynamic)  1.61 MB  9.2 × 10⁻²
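A parity figure like those in the last column can be computed by running the same input through two files and taking the largest element-wise difference. This is a sketch, not the repository's validation script: `max_abs_diff` and `run_model` are hypothetical helpers, and because the inputs behind the table are not stated, the numbers you obtain may differ.

```python
import numpy as np


def max_abs_diff(reference: np.ndarray, candidate: np.ndarray) -> float:
    """Largest element-wise absolute difference, computed in float64."""
    diff = reference.astype(np.float64) - candidate.astype(np.float64)
    return float(np.max(np.abs(diff)))


def run_model(path: str, x: np.ndarray) -> np.ndarray:
    """Run one ONNX file on x, casting the input if the graph expects float16."""
    import onnxruntime as ort  # lazy import: only needed when a model is run

    sess = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    inp = sess.get_inputs()[0]
    if inp.type == "tensor(float16)":
        x = x.astype(np.float16)
    return sess.run(None, {inp.name: x})[0]


# Usage, assuming the files are in the working directory:
# x = np.random.randn(1, 200, 257).astype(np.float32)
# ref = run_model("formantnet.onnx", x)
# print(max_abs_diff(ref, run_model("formantnet_fp16.onnx", x)))
```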

Usage

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("formantnet.onnx", providers=["CPUExecutionProvider"])

# x: float32 array of shape (batch, time, 257) holding normalized spectral
# envelopes (subtract the training mean, divide by the training std).
# Random data is used here only as a placeholder.
x = np.random.randn(1, 200, 257).astype(np.float32)
raw_params = sess.run(None, {"input": x})[0]   # (1, 200, 20), values in [0, 1]

Pre-processing (windowing, FFT, envelope smoothing, normalization) and post-processing (rescaling to Hz/dB, formant sorting, binomial smoothing) are not included in the ONNX graph; use the scripts from the original FormantNet repository for those steps.
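The 257-dimensional input follows from the frame size: a 32 ms window at 16 kHz is 512 samples, and a 512-point real FFT yields 257 bins. The sketch below illustrates only that shape relationship and the mean/std normalization; it is not the repository's pipeline and should not replace it. The Hann window, the 5 ms hop, the omission of envelope smoothing, and the per-utterance `mean`/`std` placeholders are all assumptions made for illustration.

```python
import numpy as np

SAMPLE_RATE = 16_000
WIN = SAMPLE_RATE * 32 // 1000   # 32 ms -> 512 samples -> 257 rfft bins
HOP = SAMPLE_RATE * 5 // 1000    # hypothetical 5 ms hop (not from the source)


def log_spectra(signal: np.ndarray) -> np.ndarray:
    """Frame the signal and return log-magnitude spectra of shape (frames, 257)."""
    n_frames = 1 + (len(signal) - WIN) // HOP
    idx = np.arange(WIN)[None, :] + HOP * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(WIN)  # assumed window type
    return np.log(np.abs(np.fft.rfft(frames, axis=-1)) + 1e-8)


signal = np.random.randn(SAMPLE_RATE).astype(np.float32)  # 1 s of noise as a stand-in
spec = log_spectra(signal)

# Placeholders: the real model expects the training-set statistics from the repo.
mean, std = spec.mean(axis=0), spec.std(axis=0)
x = ((spec - mean) / std)[None].astype(np.float32)
print(x.shape)  # (1, 194, 257)
```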

Conversion

  • convert_to_onnx.py: reconstructs the Keras model, loads the TF checkpoint, and exports to ONNX (opset 15)
  • quantize_onnx.py: generates the fp16 and int8 variants with parity checks
  • validate_onnx.py: validates shape, range, and numeric equivalence

Requires: tensorflow-macos 2.13, tf2onnx 1.17, onnxruntime, onnxconverter-common.
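For orientation, the fp16 and int8 steps could look roughly like the sketch below. It is not the content of quantize_onnx.py: `variant_paths` and `make_variants` are hypothetical names, `keep_io_types=True` and `QuantType.QInt8` are assumed settings, and the `onnx` package is assumed to be installed as a dependency of the listed tools.

```python
from pathlib import Path


def variant_paths(src: str) -> tuple[str, str]:
    """Derive the fp16 and int8 file names from the fp32 file name."""
    p = Path(src)
    return (
        str(p.with_name(f"{p.stem}_fp16{p.suffix}")),
        str(p.with_name(f"{p.stem}_int8{p.suffix}")),
    )


def make_variants(src: str = "formantnet.onnx") -> None:
    # Lazy imports: these packages are only needed when actually converting.
    import onnx
    from onnxconverter_common import float16
    from onnxruntime.quantization import QuantType, quantize_dynamic

    fp16_path, int8_path = variant_paths(src)

    # fp16: convert the graph; keep_io_types=True (assumed) keeps float32 inputs/outputs.
    model = onnx.load(src)
    onnx.save(float16.convert_float_to_float16(model, keep_io_types=True), fp16_path)

    # int8: dynamic quantization stores int8 weights; activations are quantized at run time.
    quantize_dynamic(src, int8_path, weight_type=QuantType.QInt8)


print(variant_paths("formantnet.onnx"))
# ('formantnet_fp16.onnx', 'formantnet_int8.onnx')
```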

Citation

@inproceedings{sakamoto2021formantnet,
  title     = {Neural Formant Tracking},
  author    = {Sakamoto, Yuki and others},
  booktitle = {Interspeech 2021},
  year      = {2021}
}