gzivdo committed · Commit 5800a2b · verified · 1 Parent(s): 5d5add8

Upload README.md with huggingface_hub

  1. README.md +137 -0
README.md ADDED
@@ -0,0 +1,137 @@
+ ---
+ license: mit
+ tags:
+ - audio
+ - pitch-estimation
+ - f0
+ - vocal
+ - onnx
+ - fcpe
+ language:
+ - en
+ library_name: onnxruntime
+ pipeline_tag: audio-to-audio
+ ---
+
+ # FCPE ONNX: unofficial export
+
+ Pre-converted ONNX export of [FCPE](https://github.com/CNChTu/FCPE)
+ (Fast Context-based Pitch Estimation, CN_ChiTu, arXiv 2509.15140).
+
+ This is an **unofficial community export** of the checkpoint bundled
+ with torchfcpe, intended for use without the PyTorch dependency. The
+ weights and architecture are unchanged; only the runtime is swapped
+ from PyTorch to ONNX Runtime.
+
+ ## Provenance
+
+ - **Upstream code & weights**: <https://github.com/CNChTu/FCPE>
+   (MIT; see [LICENSE](https://github.com/CNChTu/FCPE/blob/main/LICENSE))
+ - **Upstream paper**: Tu, "FCPE: A Fast Context-based Pitch Estimation
+   Model", arXiv [2509.15140](https://arxiv.org/abs/2509.15140), 2025
+ - **Bundled checkpoint version**: `torchfcpe == 0.0.4` (PyPI)
+ - **Export script** (this conversion): [pitch-core/tools/fcpe_export.py](https://github.com/gzivdo/pitch-core/blob/main/tools/fcpe_export.py)
+   (MIT OR Apache-2.0, copyright 2026 gzivdo)
+ - **Reproduction**: `python tools/fcpe_export.py --out fcpe.onnx`
+   (requires `pip install torch torchfcpe`)
+
+ This export is **not endorsed by, affiliated with, or sponsored by**
+ the FCPE authors. It is provided as a convenience for the open-source
+ community.
+
+ ## I/O contract
+
+ ```
+ input:  audio  float32  [1, n_samples, 1]  raw mono audio @ 16 kHz
+ output: f0_hz  float32  [1, n_frames, 1]   f0 in Hz (0 = unvoiced)
+ ```
+
+ - Sample rate: **16 000 Hz** (resample your input before feeding)
+ - Hop: **160 samples** = 10 ms
+ - Output frames: `n_samples // 160 + 1`
+ - Voicing gate: the model applies an internal `threshold=0.006` on
+   confidence; frames with confidence below it are returned as `f0=0`.
+   Some quiet frames may also return `NaN` (internal `log(0)`); treat
+   these as unvoiced.
+
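The frame arithmetic and the unvoiced handling in the list above can be sketched in plain Python. This is an illustration only, not part of the export: the helper names `expected_frames`, `frame_times`, and `sanitize_f0` are hypothetical, and the timestamps assume that frame `i` starts at sample `i * 160`, which this README does not state.

```python
import math

SAMPLE_RATE = 16_000  # Hz, from the I/O contract
HOP = 160             # samples per frame (10 ms), from the I/O contract


def expected_frames(n_samples: int) -> int:
    """Output frame count for n_samples of input: n_samples // 160 + 1."""
    return n_samples // HOP + 1


def frame_times(n_frames: int) -> list[float]:
    """Frame start times in seconds.

    Assumption: frame i starts at sample i * HOP (alignment is not documented).
    """
    return [i * HOP / SAMPLE_RATE for i in range(n_frames)]


def sanitize_f0(f0: list[float]) -> list[float]:
    """Map NaN, infinite, and non-positive values to 0.0 (unvoiced)."""
    return [v if math.isfinite(v) and v > 0 else 0.0 for v in f0]
```

For one second of audio (16 000 samples), `expected_frames` gives 101 frames; passing the raw model output through `sanitize_f0` leaves `0` as the only unvoiced marker.
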
+ ## Usage (Python)
+
+ ```python
+ import numpy as np
+ import onnxruntime as ort
+ import librosa
+
+ # Load as mono and resample to the 16 kHz rate the model expects.
+ audio, _ = librosa.load("vocal.wav", sr=16_000, mono=True)
+ sess = ort.InferenceSession("fcpe.onnx", providers=["CPUExecutionProvider"])
+ # Add batch and channel axes: [n_samples] -> [1, n_samples, 1].
+ f0 = sess.run(["f0_hz"], {"audio": audio.astype(np.float32)[None, :, None]})[0]
+ f0 = f0[0, :, 0]  # [1, n_frames, 1] -> [n_frames]
+ # Treat NaN and 0 as unvoiced.
+ voiced = np.isfinite(f0) & (f0 > 0)
+ print(f"voiced: {voiced.sum()}/{len(f0)} frames")
+ ```
+
+ ## Usage (Rust via pitch-core-onnx)
+
+ ```rust
+ use pitch_core::PitchTracker;
+ use pitch_core_onnx::FcpeEstimator;
+
+ let est = FcpeEstimator::new("fcpe.onnx")?;
+ let mut tracker = PitchTracker::new(est, 48_000, 1024)?;
+ for frame in tracker.process(&audio_chunk)? { /* ... */ }
+ ```
+
+ See <https://crates.io/crates/pitch-core-onnx> for the full crate.
+
+ ## Citation
+
+ If you use this model in academic work, cite the upstream paper, not
+ this export:
+
+ ```bibtex
+ @article{tu2025fcpe,
+   title = {FCPE: A Fast Context-based Pitch Estimation Model},
+   author = {CN\_ChiTu},
+   journal = {arXiv preprint arXiv:2509.15140},
+   year = {2025},
+   url = {https://arxiv.org/abs/2509.15140}
+ }
+ ```
+
+ ## License
+
+ This ONNX file inherits the MIT license from the FCPE upstream:
+
+ > MIT License
+ >
+ > Copyright (c) 2023 CN_ChiTu
+ >
+ > Permission is hereby granted, free of charge, to any person obtaining
+ > a copy of this software and associated documentation files (the
+ > "Software"), to deal in the Software without restriction […]
+ >
+ > THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ > EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ > MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ > NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ > BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY […]
+
+ Full text: <https://github.com/CNChTu/FCPE/blob/main/LICENSE>
+
+ ## Disclaimer
+
+ The export script `tools/fcpe_export.py` applies a small monkey-patch
+ to `torch.stft` so that the legacy ONNX tracer can handle the
+ complex-typed output from torchfcpe's mel extractor. The patch wraps
+ the real-tensor output in a `_FakeComplex` shim that exposes `.real` /
+ `.imag` as indexed views, which is semantically equivalent to the
+ original. Numerical output should match the upstream torchfcpe model
+ up to floating-point rounding differences in ONNX Runtime.
+
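One way to check the agreement described above is to run the same audio through this export and through upstream torchfcpe, then compare the two f0 tracks frame by frame. The following is a minimal sketch under stated assumptions: `is_voiced`, `compare_f0`, and the `rel_tol` default are hypothetical choices made for illustration, not values or functions published by upstream or by this export.

```python
import math


def is_voiced(v: float) -> bool:
    """A frame counts as voiced when its f0 is finite and positive."""
    return math.isfinite(v) and v > 0


def compare_f0(ref: list[float], test: list[float], rel_tol: float = 1e-4) -> dict:
    """Compare two f0 tracks (in Hz) frame by frame.

    rel_tol is an illustrative tolerance, not a documented threshold.
    """
    if len(ref) != len(test):
        raise ValueError(f"frame count mismatch: {len(ref)} vs {len(test)}")
    voicing_mismatches = 0
    max_rel_err = 0.0
    for r, t in zip(ref, test):
        if is_voiced(r) != is_voiced(t):
            # One track reports pitch where the other reports unvoiced.
            voicing_mismatches += 1
        elif is_voiced(r):
            # Both voiced: track the worst relative deviation.
            max_rel_err = max(max_rel_err, abs(r - t) / r)
    return {
        "voicing_mismatches": voicing_mismatches,
        "max_rel_err": max_rel_err,
        "ok": voicing_mismatches == 0 and max_rel_err <= rel_tol,
    }
```

`NaN` and `0` are both treated as unvoiced, so a frame that is `NaN` in one track and `0` in the other does not count as a mismatch.
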
+ This file is provided "AS IS", per the MIT license above. The
+ maintainer makes no claims about its accuracy on data outside the
+ ranges tested by upstream and provides no warranty of fitness for any
+ particular purpose.
+
+ If the upstream FCPE project releases an official ONNX export, prefer
+ that. If you find a discrepancy between this export and upstream
+ torchfcpe inference, please open an issue at
+ <https://github.com/gzivdo/pitch-core/issues>.