
	# Understanding the Interface class[[understanding-the-interface-class]]

	<CourseFloatingBanner chapter={9}
	classNames="absolute z-10 right-0 top-0"
	notebooks={[
	{label: "Google Colab", value: "https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/en/chapter9/section3.ipynb"},
	{label: "Aws Studio", value: "https://studiolab.sagemaker.aws/import/github/huggingface/notebooks/blob/master/course/en/chapter9/section3.ipynb"},
	]} />

	In this section, we will take a closer look at the `Interface` class, and understand the
	main parameters used to create one.

	## How to create an Interface[[how-to-create-an-interface]]

	You'll notice that the `Interface` class has 3 required parameters:

	`Interface(fn, inputs, outputs, ...)`

	These parameters are:

	- `fn`: the prediction function that is wrapped by the Gradio interface. This function can take one or more parameters and return one or more values
	- `inputs`: the input component type(s). Gradio provides many pre-built components such as `"image"` or `"mic"`.
	- `outputs`: the output component type(s). Again, Gradio provides many pre-built components e.g. `"image"` or `"label"`.

	For a complete list of components, [see the Gradio docs](https://gradio.app/docs). Each pre-built component can be customized by instantiating the class corresponding to the component.

	For example, as we saw in the [previous section](/course/chapter9/2),
	instead of passing in `"textbox"` to the `inputs` parameter, you can pass in a `Textbox(lines=7, label="Prompt")` component to create a textbox with 7 lines and a label.
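
To make the two styles concrete, here is a minimal sketch that builds the same interface twice: once with string shortcuts and once with customized component instances. The `count_words()` function and the `build_demos()` helper are hypothetical names introduced only for illustration, and Gradio is imported inside the helper so that the prediction function can be tried on its own.

```python
def count_words(prompt):
    """Toy prediction function (hypothetical): count whitespace-separated words."""
    return len(prompt.split())


def build_demos():
    """Build the same demo with string shortcuts and with customized components."""
    import gradio as gr  # imported lazily; count_words() itself does not need Gradio

    # String shortcuts create the components with their default settings
    shortcut = gr.Interface(fn=count_words, inputs="textbox", outputs="number")

    # Component instances expose the same components with custom arguments
    customized = gr.Interface(
        fn=count_words,
        inputs=gr.Textbox(lines=7, label="Prompt"),
        outputs=gr.Number(label="Word count"),
    )
    return shortcut, customized
```

Calling `launch()` on either returned object starts the demo; only the appearance of the input and output components differs.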

	Let's take a look at another example, this time with an `Audio` component.

	## A simple example with audio[[a-simple-example-with-audio]]

	As mentioned earlier, Gradio provides many different inputs and outputs.
	So let's build an `Interface` that works with audio.

	In this example, we'll build an audio-to-audio function that takes an
	audio file and simply reverses it.

	For the input, we will use the `Audio` component. When using the `Audio` component,
	you can specify whether the `source` of the audio should be a file that the user
	uploads or a microphone that the user records their voice with. In this case, let's
	set it to `"microphone"`. Just for fun, we'll add a label to our `Audio` component that says
	"Speak here...".

	In addition, we'd like to receive the audio as a numpy array so that we can easily
	"reverse" it. So we'll set the `type` parameter to `"numpy"`, which passes the input
	data as a tuple of (`sample_rate`, `data`) into our function.

	We will also use the `Audio` output component which can automatically
	render a tuple with a sample rate and numpy array of data as a playable audio file.
	In this case, we do not need to do any customization, so we will use the string
	shortcut `"audio"`.


	```py
	import numpy as np
	import gradio as gr


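	def reverse_audio(audio):
	    # With type="numpy", the input arrives as a (sample_rate, data) tuple
	    sr, data = audio
	    # Flip the samples along the time axis and keep the sample rate unchanged
	    reversed_audio = (sr, np.flipud(data))
	    return reversed_audio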


	mic = gr.Audio(source="microphone", type="numpy", label="Speak here...")
	gr.Interface(reverse_audio, mic, "audio").launch()
	```
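
Before wiring a function like this into an interface, it can help to check it on a small synthetic signal. The sketch below repeats `reverse_audio()` and feeds it hand-made arrays instead of a real recording. The sample rate of 8000 and the stereo layout of shape `(samples, channels)` are assumptions made for this check, not values taken from the demo.

```python
import numpy as np


def reverse_audio(audio):
    sr, data = audio
    return (sr, np.flipud(data))


sr = 8000  # assumed sample rate for the synthetic signals

# Mono signal: a 1-D array of samples
mono = np.arange(5, dtype=np.int16)

# Stereo signal, assumed to have shape (samples, channels)
stereo = np.stack([np.arange(4), 10 * np.arange(4)], axis=1).astype(np.int16)

out_sr, out_mono = reverse_audio((sr, mono))
_, out_stereo = reverse_audio((sr, stereo))

print(out_sr, out_mono.tolist())  # 8000 [4, 3, 2, 1, 0]
print(out_stereo.tolist())        # [[3, 30], [2, 20], [1, 10], [0, 0]]
```

`np.flipud` reverses the first axis only, so the samples are reversed in time while the two channels of the stereo array stay in their original columns.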

	The code above will produce an interface like the one below (if your browser doesn't
	ask you for microphone permissions, <a href="https://huggingface.co/spaces/course-demos/audio-reverse" target="_blank">open the demo in a separate tab</a>).

	<iframe src="https://course-demos-audio-reverse.hf.space" frameBorder="0" height="250" title="Gradio app" class="container p-0 flex-grow space-iframe" allow="accelerometer; ambient-light-sensor; autoplay; battery; camera; document-domain; encrypted-media; fullscreen; geolocation; gyroscope; layout-animations; legacy-image-formats; magnetometer; microphone; midi; oversized-images; payment; picture-in-picture; publickey-credentials-get; sync-xhr; usb; vr ; wake-lock; xr-spatial-tracking" sandbox="allow-forms allow-modals allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts allow-downloads"></iframe>

	You should now be able to record your voice and hear yourself speaking in reverse - spooky 👻!

	## Handling multiple inputs and outputs[[handling-multiple-inputs-and-outputs]]

	Let's say we had a more complicated function, with multiple inputs and outputs.
	In the example below, we have a function that takes a dropdown index, a slider value, and a number,
	and returns an audio sample of a musical tone.

	Take a look at how we pass a list of input and output components,
	and see if you can follow along with what's happening.

	The key here is that when you pass:
	* a list of input components, each component corresponds to a parameter in order.
	* a list of output components, each component corresponds to a returned value in order.

	The code snippet below shows how three input components line up with the three arguments of the `generate_tone()` function:

	```py
	import numpy as np
	import gradio as gr

	notes = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


	def generate_tone(note, octave, duration):
	    sr = 48000  # sample rate in Hz
	    # Number of semitones between the requested note and A4 (440 Hz)
	    a4_freq, tones_from_a4 = 440, 12 * (octave - 4) + (note - 9)
	    frequency = a4_freq * 2 ** (tones_from_a4 / 12)
	    duration = int(duration)
	    audio = np.linspace(0, duration, duration * sr)
	    audio = (20000 * np.sin(audio * (2 * np.pi * frequency))).astype(np.int16)
	    return (sr, audio)


	gr.Interface(
	    generate_tone,
	    [
	        gr.Dropdown(notes, type="index"),
	        gr.Slider(minimum=4, maximum=6, step=1),
	        gr.Number(value=1, label="Duration in seconds"),
	    ],
	    "audio",
	).launch()
	```

	<iframe src="https://course-demos-generate-tone.hf.space" frameBorder="0" height="450" title="Gradio app" class="container p-0 flex-grow space-iframe" allow="accelerometer; ambient-light-sensor; autoplay; battery; camera; document-domain; encrypted-media; fullscreen; geolocation; gyroscope; layout-animations; legacy-image-formats; magnetometer; microphone; midi; oversized-images; payment; picture-in-picture; publickey-credentials-get; sync-xhr; usb; vr ; wake-lock; xr-spatial-tracking" sandbox="allow-forms allow-modals allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts allow-downloads"></iframe>
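
The example above returns a single value, so it only needs one output component. As a minimal sketch of the multiple-output case, the variant below also returns a text label with the computed frequency and lists two output components in the same order as the returned values. The names `generate_tone_with_label()` and `build_demo()` are hypothetical, and Gradio is imported inside the helper so the function can be checked on its own.

```python
import numpy as np

notes = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def generate_tone_with_label(note, octave, duration):
    """Return two values: an audio tuple and a text label (hypothetical variant)."""
    sr = 48000
    tones_from_a4 = 12 * (octave - 4) + (note - 9)
    frequency = 440 * 2 ** (tones_from_a4 / 12)
    duration = int(duration)
    t = np.linspace(0, duration, duration * sr)
    audio = (20000 * np.sin(t * (2 * np.pi * frequency))).astype(np.int16)
    label = f"{notes[note]}{int(octave)}: {frequency:.2f} Hz"
    return (sr, audio), label  # first value -> "audio", second value -> "text"


def build_demo():
    import gradio as gr  # imported lazily; the function above only needs numpy

    return gr.Interface(
        generate_tone_with_label,
        [
            gr.Dropdown(notes, type="index"),
            gr.Slider(minimum=4, maximum=6, step=1),
            gr.Number(value=1, label="Duration in seconds"),
        ],
        ["audio", "text"],  # one output component per returned value, in order
    )


# Quick check without Gradio: index 9 is "A", so A4 should come out at 440 Hz
(sr, audio), label = generate_tone_with_label(9, 4, 1)
print(label)  # A4: 440.00 Hz
```

Calling `build_demo().launch()` would then show both the playable tone and the frequency text side by side.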


	### The `launch()` method[[the-launch-method]]

	So far, we have used the `launch()` method to launch the interface, but we
	haven't really discussed what it does.

	By default, the `launch()` method will launch the demo in a web server that
	is running locally. If you are running your code in a Jupyter or Colab notebook, then
	Gradio will embed the demo GUI in the notebook so you can easily use it.

	You can customize the behavior of `launch()` through different parameters:

	- `inline` - whether to display the interface inline in Python notebooks.
	- `inbrowser` - whether to automatically launch the interface in a new tab on the default browser.
	- `share` - whether to create a publicly shareable link from your computer for the interface. Kind of like a Google Drive link!
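
As a minimal sketch of how these parameters are passed, the snippet below collects them in a dictionary and forwards them to `launch()`. The `greet()` function, the `LAUNCH_OPTIONS` name, and the chosen values are illustrative assumptions, not recommended settings.

```python
# Hypothetical settings: skip the inline notebook view, open a browser tab,
# and do not create a public link
LAUNCH_OPTIONS = {"inline": False, "inbrowser": True, "share": False}


def greet(name):
    """Toy prediction function used only for this illustration."""
    return f"Hello, {name}!"


def main():
    import gradio as gr  # imported lazily so greet() can be tried without Gradio

    demo = gr.Interface(fn=greet, inputs="text", outputs="text")
    demo.launch(**LAUNCH_OPTIONS)
```

Running `main()` from a script should start the local server and open the demo in a new browser tab.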

	We'll cover the `share` parameter in a lot more detail in the next section!

	## ✏️ Let's apply it![[lets-apply-it]]

	Let's build an interface that allows you to demo a speech-recognition model.
	To make it interesting, we will accept either a mic input or an uploaded file.

	As usual, we'll load our speech recognition model using the `pipeline()` function from 🤗 Transformers.
	If you need a quick refresher, you can go back to [that section in Chapter 1](/course/chapter1/3). Next, we'll implement a `transcribe_audio()` function that processes the audio and returns the transcription. Finally, we'll wrap this function in an `Interface` with an `Audio` component for the input and just text for the output. Altogether, the code for this application is the following:

	```py
	from transformers import pipeline
	import gradio as gr

	model = pipeline("automatic-speech-recognition")


	def transcribe_audio(audio):
	    # With type="filepath", `audio` is the path to the recorded or uploaded file
	    transcription = model(audio)["text"]
	    return transcription


	gr.Interface(
	    fn=transcribe_audio,
	    inputs=gr.Audio(type="filepath"),
	    outputs="text",
	).launch()
	```

	If your browser doesn't ask you for microphone permissions, <a href="https://huggingface.co/spaces/course-demos/asr" target="_blank">open the demo in a separate tab</a>.

	<iframe src="https://course-demos-asr.hf.space" frameBorder="0" height="550" title="Gradio app" class="container p-0 flex-grow space-iframe" allow="accelerometer; ambient-light-sensor; autoplay; battery; camera; document-domain; encrypted-media; fullscreen; geolocation; gyroscope; layout-animations; legacy-image-formats; magnetometer; microphone; midi; oversized-images; payment; picture-in-picture; publickey-credentials-get; sync-xhr; usb; vr ; wake-lock; xr-spatial-tracking" sandbox="allow-forms allow-modals allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts allow-downloads"></iframe>


	That's it! You can now use this interface to transcribe audio. Notice here that
	a single `Audio` component with `type="filepath"` passes the path of the audio to our function, and in recent Gradio releases the component accepts either
	a microphone recording or an uploaded audio file by default (submitting neither will return an error message).
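
If the user submits without providing any audio, the function has nothing to transcribe. A minimal sketch of handling that case explicitly follows, assuming Gradio passes `None` to the function when the `Audio` component is empty. The guard, the message text, the extra `model` argument, and the `build_demo()` helper are all assumptions added for illustration.

```python
def transcribe_audio(audio, model):
    """Return the transcription, or a friendly message when no audio was given."""
    if audio is None:  # assumed value for an empty Audio component
        return "Please record or upload an audio clip first."
    return model(audio)["text"]


def build_demo():
    # Imported lazily so transcribe_audio() can be tested with a stand-in model
    import gradio as gr
    from transformers import pipeline

    model = pipeline("automatic-speech-recognition")

    def fn(audio):
        return transcribe_audio(audio, model)

    return gr.Interface(fn=fn, inputs=gr.Audio(type="filepath"), outputs="text")


# Stand-in for the real pipeline, used only to exercise both branches
def fake_model(path):
    return {"text": f"transcript of {path}"}


print(transcribe_audio(None, fake_model))        # Please record or upload an audio clip first.
print(transcribe_audio("clip.wav", fake_model))  # transcript of clip.wav
```

Passing the model as an argument keeps the function easy to test, since the real pipeline is only loaded inside `build_demo()`.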

	Keep going to see how to share your interface with others!


