| <meta charset="utf-8" /><meta name="hf:doc:metadata" content="{"title":"Audio Classification","local":"audio-classification","sections":[{"title":"Recommended models","local":"recommended-models","sections":[],"depth":3},{"title":"Using the API","local":"using-the-api","sections":[],"depth":3},{"title":"API specification","local":"api-specification","sections":[{"title":"Request","local":"request","sections":[],"depth":4},{"title":"Response","local":"response","sections":[],"depth":4}],"depth":3}],"depth":2}"> | |
| <link href="/docs/inference-providers/pr_1663/en/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload"> | |
| <link rel="modulepreload" href="/docs/inference-providers/pr_1663/en/_app/immutable/entry/start.d5f15666.js"> | |
| <link rel="modulepreload" href="/docs/inference-providers/pr_1663/en/_app/immutable/chunks/scheduler.ddb4e551.js"> | |
| <link rel="modulepreload" href="/docs/inference-providers/pr_1663/en/_app/immutable/chunks/singletons.0f5b782d.js"> | |
| <link rel="modulepreload" href="/docs/inference-providers/pr_1663/en/_app/immutable/chunks/index.ce98237b.js"> | |
| <link rel="modulepreload" href="/docs/inference-providers/pr_1663/en/_app/immutable/chunks/paths.b324c1e2.js"> | |
| <link rel="modulepreload" href="/docs/inference-providers/pr_1663/en/_app/immutable/entry/app.68b4644d.js"> | |
| <link rel="modulepreload" href="/docs/inference-providers/pr_1663/en/_app/immutable/chunks/index.e16e4efa.js"> | |
| <link rel="modulepreload" href="/docs/inference-providers/pr_1663/en/_app/immutable/nodes/0.80863911.js"> | |
| <link rel="modulepreload" href="/docs/inference-providers/pr_1663/en/_app/immutable/chunks/each.e59479a4.js"> | |
| <link rel="modulepreload" href="/docs/inference-providers/pr_1663/en/_app/immutable/nodes/8.3c32c7fd.js"> | |
| <link rel="modulepreload" href="/docs/inference-providers/pr_1663/en/_app/immutable/chunks/Tip.20abb04f.js"> | |
| <link rel="modulepreload" href="/docs/inference-providers/pr_1663/en/_app/immutable/chunks/index.e108c5ed.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{"title":"Audio Classification","local":"audio-classification","sections":[{"title":"Recommended models","local":"recommended-models","sections":[],"depth":3},{"title":"Using the API","local":"using-the-api","sections":[],"depth":3},{"title":"API specification","local":"api-specification","sections":[{"title":"Request","local":"request","sections":[],"depth":4},{"title":"Response","local":"response","sections":[],"depth":4}],"depth":3}],"depth":2}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <h2 class="relative group"><a id="audio-classification" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#audio-classification"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Audio Classification</span></h2> <p data-svelte-h="svelte-1ny8y9l">Audio classification is the task of assigning a label or class to a given audio.</p> <p data-svelte-h="svelte-1iml56d">Example applications:</p> <ul data-svelte-h="svelte-zjmuse"><li>Recognizing which command a user is 
giving</li> <li>Identifying a speaker</li> <li>Detecting the genre of a song</li></ul> <div class="course-tip bg-gradient-to-br dark:bg-gradient-to-r before:border-green-500 dark:before:border-green-800 from-green-50 dark:from-gray-900 to-white dark:to-gray-950 border border-green-50 text-green-700 dark:text-gray-400"><p data-svelte-h="svelte-1upmpac">For more details about the <code>audio-classification</code> task, check out its <a href="https://huggingface.co/tasks/audio-classification" rel="nofollow">dedicated page</a>! You will find examples and related materials.</p></div> <h3 class="relative group"><a id="recommended-models" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#recommended-models"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Recommended models</span></h3> <ul data-svelte-h="svelte-1g0trst"><li><a href="https://huggingface.co/speechbrain/google_speech_command_xvector" rel="nofollow">speechbrain/google_speech_command_xvector</a>: An easy-to-use model for command recognition.</li> <li><a href="https://huggingface.co/ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition" 
rel="nofollow">ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition</a>: An emotion recognition model.</li> <li><a href="https://huggingface.co/facebook/mms-lid-126" rel="nofollow">facebook/mms-lid-126</a>: A language identification model.</li></ul> <p data-svelte-h="svelte-1n75j2e">Explore all available models and find the one that suits you best <a href="https://huggingface.co/models?inference=warm&pipeline_tag=audio-classification&sort=trending" rel="nofollow">here</a>.</p> <h3 class="relative group"><a id="using-the-api" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#using-the-api"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Using the API</span></h3> <p data-svelte-h="svelte-1kehkb7">No snippet available for this task.</p> <h3 class="relative group"><a id="api-specification" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#api-specification"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" 
height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>API specification</span></h3> <h4 class="relative group"><a id="request" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#request"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Request</span></h4> <table data-svelte-h="svelte-2n8vhn"><thead><tr><th align="left">Payload</th> <th align="left"></th> <th align="left"></th></tr></thead> <tbody><tr><td align="left"><strong>inputs*</strong></td> <td align="left"><em>string</em></td> <td align="left">The input audio data as a base64-encoded string. 
If no <code>parameters</code> are provided, you can also provide the audio data as a raw bytes payload.</td></tr> <tr><td align="left"><strong> function_to_apply</strong></td> <td align="left"><em>enum</em></td> <td align="left">Possible values: sigmoid, softmax, none.</td></tr> <tr><td align="left"><strong> top_k</strong></td> <td align="left"><em>integer</em></td> <td align="left">When specified, limits the output to the top K most probable classes.</td></tr></tbody></table> <p data-svelte-h="svelte-xa4wks">Some options can be configured by passing headers to the Inference API. Here are the available headers:</p> <table data-svelte-h="svelte-2rfiu7"><thead><tr><th align="left">Headers</th> <th align="left"></th> <th align="left"></th></tr></thead> <tbody><tr><td align="left"><strong>authorization</strong></td> <td align="left"><em>string</em></td> <td align="left">Authentication header in the form <code>Bearer hf_****</code>, where <code>hf_****</code> is a personal user access token with Inference API permission. You can generate one from <a href="https://huggingface.co/settings/tokens" rel="nofollow">your settings page</a>.</td></tr> <tr><td align="left"><strong>x-use-cache</strong></td> <td align="left"><em>boolean, defaults to <code>true</code></em></td> <td align="left">There is a cache layer on the Inference API to speed up requests it has already seen. Most models can use those cached results because they are deterministic (the outputs will be the same anyway). However, if you use a nondeterministic model, you can set this header to <code>false</code> to bypass the cache and force a new query.
Read more about caching <a href="../parameters#caching">here</a>.</td></tr> <tr><td align="left"><strong>x-wait-for-model</strong></td> <td align="left"><em>boolean, defaults to <code>false</code></em></td> <td align="left">If the model is not ready, wait for it instead of receiving a 503 error. This limits the number of requests required to get your inference done. It is advised to set this flag to true only after receiving a 503 error, as this confines any hanging in your application to known places. Read more about model availability <a href="../overview#eligibility">here</a>.</td></tr></tbody></table> <p data-svelte-h="svelte-1ps9cb1">For more information about Inference API headers, check out the parameters <a href="../parameters">guide</a>.</p> <h4 class="relative group"><a id="response" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#response"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Response</span></h4> <table data-svelte-h="svelte-1mk6wu5"><thead><tr><th align="left">Body</th> <th align="left"></th> <th align="left"></th></tr></thead> <tbody><tr><td align="left"><strong>(array)</strong></td> <td 
align="left"><em>object[]</em></td> <td align="left">Output is an array of objects.</td></tr> <tr><td align="left"><strong> label</strong></td> <td align="left"><em>string</em></td> <td align="left">The predicted class label.</td></tr> <tr><td align="left"><strong> score</strong></td> <td align="left"><em>number</em></td> <td align="left">The corresponding probability.</td></tr></tbody></table> <p></p> | |
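<p>The request and response tables above can be illustrated with a minimal Python sketch. It only builds the JSON payload and headers and parses a response body; it sends nothing over the network, and the endpoint URL is deliberately left out because this page does not specify one. The helper names <code>build_payload</code>, <code>build_headers</code>, and <code>best_prediction</code> are hypothetical, and sending the boolean headers as the strings <code>"true"</code> and <code>"false"</code> is an assumption.</p>

```python
import base64
import json

# Values listed for `function_to_apply` in the Request table.
ALLOWED_FUNCTIONS = {"sigmoid", "softmax", "none"}


def build_payload(audio_bytes, function_to_apply=None, top_k=None):
    """Build the JSON body: base64-encoded `inputs` plus optional `parameters`."""
    payload = {"inputs": base64.b64encode(audio_bytes).decode("ascii")}
    parameters = {}
    if function_to_apply is not None:
        if function_to_apply not in ALLOWED_FUNCTIONS:
            raise ValueError(f"unsupported function_to_apply: {function_to_apply!r}")
        parameters["function_to_apply"] = function_to_apply
    if top_k is not None:
        parameters["top_k"] = int(top_k)
    # Only include `parameters` when at least one option was given.
    if parameters:
        payload["parameters"] = parameters
    return payload


def build_headers(token, use_cache=True, wait_for_model=False):
    """Build the headers described in the Headers table (hypothetical helper)."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        # Assumption: boolean header values are sent as lowercase strings.
        "x-use-cache": "true" if use_cache else "false",
        "x-wait-for-model": "true" if wait_for_model else "false",
    }


def best_prediction(response_body):
    """Return the highest-scoring {label, score} object from a response body."""
    predictions = json.loads(response_body)
    return max(predictions, key=lambda item: item["score"])
```

<p>In this sketch, <code>wait_for_model</code> stays <code>False</code> by default, matching the advice to enable it only after a 503 error has been received.</p>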
| <script> | |
| { | |
| __sveltekit_1o5mypj = { | |
| assets: "/docs/inference-providers/pr_1663/en", | |
| base: "/docs/inference-providers/pr_1663/en", | |
| env: {} | |
| }; | |
| const element = document.currentScript.parentElement; | |
| const data = [null,null]; | |
| Promise.all([ | |
| import("/docs/inference-providers/pr_1663/en/_app/immutable/entry/start.d5f15666.js"), | |
| import("/docs/inference-providers/pr_1663/en/_app/immutable/entry/app.68b4644d.js") | |
| ]).then(([kit, app]) => { | |
| kit.start(app, element, { | |
| node_ids: [0, 8], | |
| data, | |
| form: null, | |
| error: null | |
| }); | |
| }); | |
| } | |
| </script> | |