How can I get higher accuracy?
I am trying to use your model with signals from the SIGID wiki website, and the results are nowhere near 0.99. Can you tell me what I am doing wrong?
Here is my code:
waveform, sample_rate = torchaudio.load(wav_path)
# Resample to 16kHz if necessary
if sample_rate != 16000:
    resampler = torchaudio.transforms.Resample(orig_freq=sample_rate, new_freq=16000)
    waveform = resampler(waveform)
# Preprocess the audio (extract features)
inputs = feature_extractor(waveform.squeeze().numpy(), 16000, return_tensors="pt")
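
One thing that may be worth checking in the snippet above: torchaudio.load returns a tensor shaped (channels, samples), so for a stereo file squeeze() leaves it 2-D instead of producing a 1-D array. Below is a minimal sketch of a downmix step, written with NumPy so it runs on its own. The name to_mono is a hypothetical helper, not part of torchaudio or transformers, and it assumes a channels-first layout.

```python
import numpy as np


def to_mono(waveform: np.ndarray) -> np.ndarray:
    """Average a (channels, samples) array down to a 1-D (samples,) array.

    Hypothetical helper. Assumes channels-first layout, as in
    torchaudio.load(path)[0].numpy().
    """
    waveform = np.asarray(waveform, dtype=np.float32)
    if waveform.ndim == 1:
        return waveform  # already mono, nothing to do
    if waveform.ndim != 2:
        raise ValueError(f"expected 1-D or 2-D audio, got shape {waveform.shape}")
    # Average across the channel axis so every channel contributes equally.
    return waveform.mean(axis=0)


# Synthetic stereo clip: left channel all ones, right channel all zeros.
stereo = np.stack([np.ones(8), np.zeros(8)])
mono = to_mono(stereo)
print(mono.shape)  # (8,)
```

In the original snippet this would sit between loading and resampling, for example by passing to_mono(waveform.numpy()) to the extractor instead of waveform.squeeze().numpy(). Whether this explains the low scores is not something the snippet alone can confirm.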
Yeah, same here. Even the Sigwiki files it was trained on are a total miss. I ran a bunch of them and it is not even close to guessing what they are:
Loading model from local path: D:\Projects\AI\Sigwiki\sigwiki\model\AST_finetuned_SIGIDwiki
Processing audio: .\2G_ALEaudio.mp3
Detected original sampling rate: 44100 Hz
Raw loaded shape: (1317888,) (channels × samples or samples)
After mono/1D conversion shape: (1317888,)
Resampling from 44100 Hz to 16000 Hz...
Final input to extractor shape: (478146,) (must be 1D)
============================================================
Predicted Signal Type : Morse Code (CW)
Confidence : 0.4621 (46.21%)
Top 3 predictions:
- Morse Code (CW) 0.4621 (46.21%)
- STANAG 5065 0.1916 (19.16%)
- Automatic Picture Transmission (APT) 0.0359 (3.59%)
PS D:\Projects\AI\Sigwiki>
PS D:\Projects\AI\Sigwiki> python .\sigwiki.py .\29B6_40Hz_USB_5kHz.ogg
Loading model from local path: D:\Projects\AI\Sigwiki\sigwiki\model\AST_finetuned_SIGIDwiki
Processing audio: .\29B6_40Hz_USB_5kHz.ogg
Detected original sampling rate: 48000 Hz
Raw loaded shape: (518400, 2) (channels × samples or samples)
Downmixed multi-channel audio to mono
After mono/1D conversion shape: (518400,)
Resampling from 48000 Hz to 16000 Hz...
Final input to extractor shape: (172800,) (must be 1D)
============================================================
Predicted Signal Type : STANAG 5065
Confidence : 0.4002 (40.02%)
Top 3 predictions:
- STANAG 5065 0.4002 (40.02%)
- 5G "New Radio" cellular network - Downlink 0.2346 (23.46%)
- Digital Audio Broadcasting Plus (DAB+) 0.1490 (14.90%)
PS D:\Projects\AI\Sigwiki>
PS D:\Projects\AI\Sigwiki> python .\sigwiki.py .\Amps_cell_broadcast_decreasedVOL.wav
Loading model from local path: D:\Projects\AI\Sigwiki\sigwiki\model\AST_finetuned_SIGIDwiki
Processing audio: .\Amps_cell_broadcast_decreasedVOL.wav
Detected original sampling rate: 96000 Hz
Raw loaded shape: (1217802,) (channels × samples or samples)
After mono/1D conversion shape: (1217802,)
Resampling from 96000 Hz to 16000 Hz...
Final input to extractor shape: (202967,) (must be 1D)
============================================================
Predicted Signal Type : 5G "New Radio" cellular network - Downlink
Confidence : 0.5153 (51.53%)
Top 3 predictions:
- 5G "New Radio" cellular network - Downlink 0.5153 (51.53%)
- Digital Mobile Radio (DMR) 0.0775 (7.75%)
- STANAG 5065 0.0659 (6.59%)
PS D:\Projects\AI\Sigwiki>
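
As a quick sanity check on the three runs above, the sample counts reported after resampling can be compared with what the original lengths and sampling rates imply. This is only a sketch: resampled_length is a hypothetical helper, and rounding up is an assumption chosen because it matches these logs (other resamplers may differ by a sample).

```python
def resampled_length(num_samples: int, orig_sr: int, new_sr: int = 16000) -> int:
    """Expected sample count after resampling, rounded up (assumed convention)."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (num_samples * new_sr + orig_sr - 1) // orig_sr


# (original samples, original rate, length reported in the log)
runs = [
    (1317888, 44100, 478146),  # 2G_ALEaudio.mp3
    (518400, 48000, 172800),   # 29B6_40Hz_USB_5kHz.ogg
    (1217802, 96000, 202967),  # Amps_cell_broadcast_decreasedVOL.wav
]

for n, sr, logged in runs:
    expected = resampled_length(n, sr)
    seconds = expected / 16000
    print(f"expected={expected} logged={logged} match={expected == logged} ({seconds:.1f} s)")
```

All three lengths agree with the logs, so the resampling step at least produces the expected amount of audio. That points the search toward other stages, such as the input layout, the feature extraction settings, or the model itself.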