How to use philschmid/pyannote-speaker-diarization-endpoint with pyannote.audio:
from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained("philschmid/pyannote-speaker-diarization-endpoint")
# inference on the whole file
pipeline("file.wav")
# inference on an excerpt
from pyannote.core import Segment
excerpt = Segment(start=2.0, end=5.0)
from pyannote.audio import Audio
waveform, sample_rate = Audio().crop("file.wav", excerpt)
pipeline({"waveform": waveform, "sample_rate": sample_rate})
Example use with requests/httpx in Python?
Hey @philschmid,
I am the creator of pyannote.
Could you please share in the README how one would send requests to this endpoint from Python, once it is deployed on Hugging Face Inference Endpoints?
I have trouble understanding how one should load the audio on the client side, and how to send extra parameters as well.
import httpx
with open("sample.wav", "rb") as f:
    audio = f.read()
data = {
    "inputs": audio,
    "parameters": {"num_speakers": 2},
}
headers = {'Authorization': f'Bearer {ENDPOINT_TOKEN}'}
r = httpx.post(ENDPOINT_URL, json=data, headers=headers)
The above won't work because httpx cannot serialize the payload to JSON: audio is bytes, and bytes are not JSON serializable.
Thanks!
Hervé.
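For readers who want to see the failure in isolation, here is a minimal sketch that reproduces it with the standard json module alone, on the assumption that httpx relies on the same encoder for its json= argument. The two stand-in bytes replace real audio.

```python
import json

# Stand-in bytes instead of real audio; the shape mirrors the payload above.
payload = {"inputs": b"\x00\x01", "parameters": {"num_speakers": 2}}

try:
    json.dumps(payload)
    error = None
except TypeError as exc:
    # The standard JSON encoder has no representation for bytes.
    error = str(exc)

print(error)
```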
Could you try sending a binary body with a content-type header?
Something like below:
import mimetypes
import requests
# Path to your audio file
file_path = "path/to/your/audiofile.mp3"
# Read the binary content of the file
with open(file_path, 'rb') as audio_file:
    file_content = audio_file.read()
# Get the content type of the file
content_type = mimetypes.guess_type(file_path)[0]
# Prepare headers
headers = {
    'Content-Type': content_type
}
# Send POST request (url is the address of your deployed endpoint)
response = requests.post(url, headers=headers, data=file_content)
Thanks, but my question was more about sending both binary audio and JSON parameters in the same request.
Through trial and error, I found a solution on my side (for a different but similar endpoint) by base64-encoding the audio and sending the whole thing as JSON.
I believe updating the documentation with an audio+parameter example would help.
Closing as I found a solution.
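The base64 approach mentioned above can be sketched roughly as follows. This is not the exact code used in the thread: build_json_payload is a hypothetical helper, and it assumes the server-side handler base64-decodes the "inputs" field before running the pipeline, which this endpoint may or may not do.

```python
import base64
import json


def build_json_payload(audio_bytes: bytes, parameters: dict) -> str:
    """Encode raw audio as base64 text so it can travel inside a JSON body."""
    payload = {
        # Assumption: the handler on the server base64-decodes "inputs".
        "inputs": base64.b64encode(audio_bytes).decode("ascii"),
        "parameters": parameters,
    }
    return json.dumps(payload)


# Stand-in bytes; in practice, read them with open("sample.wav", "rb").
body = build_json_payload(b"RIFF....WAVEfmt ", {"num_speakers": 2})
```

The resulting string could then be posted as the request body with a Content-Type of application/json, alongside the Authorization header from the first post.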
When sending a binary body, you can add query parameters to the URL, which will then be passed in as "parameters".
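A minimal sketch of that suggestion, under stated assumptions: build_binary_request is a hypothetical helper, the example.invalid URL is a placeholder, and the parameter name num_speakers is taken from the first post. How the server parses the values (for instance, whether "2" arrives as an integer) is not confirmed here.

```python
import mimetypes
from urllib.parse import urlencode


def build_binary_request(endpoint_url: str, file_path: str, parameters: dict):
    """Return the URL and headers for posting raw audio with query parameters."""
    # Extra parameters go in the query string, not in the body.
    url = f"{endpoint_url}?{urlencode(parameters)}"
    # Fall back to a generic type when the extension is not recognized.
    content_type = mimetypes.guess_type(file_path)[0] or "application/octet-stream"
    return url, {"Content-Type": content_type}


# "https://example.invalid/endpoint" is a placeholder, not a real endpoint.
url, headers = build_binary_request(
    "https://example.invalid/endpoint", "sample.wav", {"num_speakers": 2}
)
```

The raw file bytes would then go in the request body, for example requests.post(url, headers=headers, data=file_content), with the Authorization header added for a protected endpoint.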