	# processors

	Processors turn raw inputs (images, audio, text) into the tensor
	shapes a model expects. Pipelines pick the right processor automatically;
	call one directly only when you need to preprocess without running
	inference.

	Three `Auto*` entry points cover the common cases:
	- `AutoProcessor` — multi-modal (tokenizer + image/audio), e.g. Whisper, CLIP.
	- `AutoImageProcessor` — vision-only models.
	- `AutoFeatureExtractor` — audio-only models.

	Example: Prepare audio for Whisper.
	```javascript
	import { AutoProcessor, load_audio } from '@huggingface/transformers';

	const processor = await AutoProcessor.from_pretrained('onnx-community/whisper-tiny.en');
	const audio = await load_audio('https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac', 16000);
	const { input_features } = await processor(audio);
	// Tensor {
	// data: Float32Array(240000) [0.4752984642982483, 0.5597258806228638, 0.56434166431427, ...],
	// dims: [1, 80, 3000],
	// type: 'float32',
	// size: 240000,
	// }
	```
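
	The `[1, 80, 3000]` shape in the output above follows from simple arithmetic. The sketch below assumes the values usually found in a Whisper `preprocessor_config.json` (16 kHz sampling rate, 30-second chunks, a hop length of 160 samples, 80 mel bins). `expectedWhisperDims` is a hypothetical helper for illustration, not part of the library.

	```javascript
	// Hypothetical helper: derives the log-mel feature shape from preprocessing settings.
	// The default values are assumptions based on the usual Whisper configuration.
	function expectedWhisperDims({
	  samplingRate = 16000, // samples per second
	  chunkLength = 30,     // seconds of audio per padded chunk
	  hopLength = 160,      // samples between consecutive frames
	  numMelBins = 80,      // mel filterbank size
	} = {}) {
	  const numSamples = samplingRate * chunkLength; // 480000 samples
	  const numFrames = numSamples / hopLength;      // 3000 frames
	  return [1, numMelBins, numFrames];             // [batch, mel bins, frames]
	}

	const dims = expectedWhisperDims();
	const size = dims.reduce((a, b) => a * b, 1);
	console.log(dims, size); // [ 1, 80, 3000 ] 240000
	```

	Under these assumptions, shorter clips are padded to the full 30 seconds, which is why the shape does not depend on the length of the input audio.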

	## On this page

	Classes — [`FeatureExtractor`](#module_processors.FeatureExtractor) · [`ImageProcessor`](#module_processors.ImageProcessor) · [`AutoFeatureExtractor`](#module_processors.AutoFeatureExtractor) · [`AutoImageProcessor`](#module_processors.AutoImageProcessor) · [`AutoProcessor`](#module_processors.AutoProcessor) · [`Processor`](#module_processors.Processor)

	## Classes

	### FeatureExtractor

	Base class for audio feature extractors.

	#### `FeatureExtractor.constructor(config)`

	Create a feature extractor from a parsed `preprocessor_config.json`.

	Parameters

	- `config` (`Object`) — The configuration for the feature extractor.

	#### `FeatureExtractor.from_pretrained(pretrained_model_name_or_path, options)`

	Instantiate one of the feature extractor classes of the library from a pretrained model.

	The feature extractor class to instantiate is selected based on the `feature_extractor_type` property of
	the config object (either passed as an argument or loaded from `pretrained_model_name_or_path` if possible).

	Parameters

	- `pretrained_model_name_or_path` (`string`) — The name or path of the pretrained model. Can be either:
	  - A string, the model ID of a pretrained feature extractor hosted inside a model repo on huggingface.co.
	    Valid model IDs can be located at the root level, like `bert-base-uncased`, or namespaced under a
	    user or organization name, like `dbmdz/bert-base-german-cased`.
	  - A path to a directory containing feature extractor files, e.g., `./my_model_directory/`.
	- `options` ([`PretrainedOptions`](./utils/hub#module_utils/hub.PretrainedOptions)) — Additional options for loading the feature_extractor.

	Returns: `Promise`<[`FeatureExtractor`](./processors#module_processors.FeatureExtractor)> — A new feature extractor instance.

	### ImageProcessor

	Base class for image processors.

	#### `ImageProcessor(images, args)`

	Preprocess one or more images and batch the result into `pixel_values`.

	Parameters

	- `images` ([`RawImage[]?`](./utils/image.md#module_utils/image.RawImage)) — The image or images to preprocess.
	- `args` (`...any`) — Additional arguments.

	Returns: `Promise`<[`ImageProcessorResult`](./processors#module_processors.ImageProcessorResult)> — An object containing the concatenated pixel values (and other metadata) of the preprocessed images.
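
	To make the batching step concrete, here is a minimal sketch of how per-image tensors of identical shape could be stacked into one batch. `stackPixelValues` and the plain `{ data, dims }` objects are hypothetical stand-ins for illustration; the library's own batching may differ in detail.

	```javascript
	// Hypothetical helper: stack per-image tensors of identical shape into one batch.
	function stackPixelValues(images) {
	  if (images.length === 0) throw new Error('Expected at least one image.');
	  const [first] = images;
	  const perImage = first.data.length;
	  const data = new Float32Array(images.length * perImage);
	  images.forEach((img, i) => {
	    // Batching only works when every image has the same dims.
	    if (img.dims.join() !== first.dims.join()) {
	      throw new Error('All images must share the same dims before batching.');
	    }
	    data.set(img.data, i * perImage);
	  });
	  // Prepend the batch dimension to the per-image dims.
	  return { data, dims: [images.length, ...first.dims] };
	}

	const batch = stackPixelValues([
	  { data: new Float32Array([1, 2, 3, 4]), dims: [1, 2, 2] },
	  { data: new Float32Array([5, 6, 7, 8]), dims: [1, 2, 2] },
	]);
	console.log(batch.dims); // [ 2, 1, 2, 2 ]
	```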

	#### `ImageProcessor.constructor(config)`

	Create an image processor from a parsed `preprocessor_config.json`.

	Parameters

	- `config` ([`ImageProcessorConfig`](./processors#module_processors.ImageProcessorConfig)) — The configuration object.

	#### `ImageProcessor.thumbnail(image, size, [resample])`

	Resize the image to make a thumbnail. The image is resized so that no dimension is larger than any
	corresponding dimension of the specified size.

	Parameters

	- `image` ([`RawImage`](./utils/image#module_utils/image.RawImage)) — The image to be resized.
	- `size` (`{height:number, width:number}`) — The size `{"height": h, "width": w}` to resize the image to.
	- `resample` (`string` \| `0` \| `1` \| `2` \| `3` \| `4` \| `5`) _optional_ — defaults to `2` — The resampling filter to use.

	Returns: `Promise`<[`RawImage`](./utils/image#module_utils/image.RawImage)> — The resized image.
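
	The size rule described above can be sketched as a small pure function. This is an illustration under stated assumptions, not the library's implementation: `thumbnailSize` is a hypothetical helper, and rounding down is an assumed choice.

	```javascript
	// Hypothetical helper: largest size that fits inside `maxSize` while keeping aspect ratio.
	// Images that already fit are left unchanged, so a thumbnail is never upscaled.
	function thumbnailSize({ width, height }, maxSize) {
	  if (width <= maxSize.width && height <= maxSize.height) {
	    return { width, height };
	  }
	  // Compare aspect ratios with integer arithmetic to find the limiting side.
	  if (width * maxSize.height > height * maxSize.width) {
	    // Width is the limiting dimension.
	    return {
	      width: maxSize.width,
	      height: Math.max(1, Math.floor((height * maxSize.width) / width)),
	    };
	  }
	  // Height is the limiting dimension (or both ratios are equal).
	  return {
	    width: Math.max(1, Math.floor((width * maxSize.height) / height)),
	    height: maxSize.height,
	  };
	}

	console.log(thumbnailSize({ width: 1000, height: 500 }, { width: 200, height: 200 }));
	// { width: 200, height: 100 }
	```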

	#### `ImageProcessor.crop_margin(image, gray_threshold)`

	Crops the margin of the image. Gray pixels are considered margin (i.e., pixels with a value below the threshold).

	Parameters

	- `image` ([`RawImage`](./utils/image#module_utils/image.RawImage)) — The image to be cropped.
	- `gray_threshold` (`number`) — Value below which pixels are considered to be gray.

	Returns: `Promise`<[`RawImage`](./utils/image#module_utils/image.RawImage)> — The cropped image.

	#### `ImageProcessor.pad_image(pixelData, imgDims, padSize, options)`

	Pad the image by a certain amount.

	Parameters

	- `pixelData` (`Float32Array`) — The pixel data to pad.
	- `imgDims` (`number[]`) — The dimensions of the image (height, width, channels).
	- `padSize` (`{width:number; height:number}` \| `number` \| `'square'`) — The dimensions of the padded image.
	- `options` (`Object`) — The options for padding.
	  - `mode` (`'constant'` \| `'symmetric'`) _optional_ — defaults to `'constant'` — The type of padding to add.
	  - `center` (`boolean`) _optional_ — defaults to `false` — Whether to center the image.
	  - `constant_values` (`number[]?`) _optional_ — defaults to `0` — The constant value to use for padding.

	Returns: [`Float32Array`, `number[]`] — The padded pixel data and image dimensions.
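
	As an illustration of the `'constant'` mode, the sketch below pads channels-last pixel data with a single fill value. `padConstant` is a hypothetical helper: it accepts only an explicit `{ width, height }` target and one scalar fill value, which is narrower than the options listed above.

	```javascript
	// Hypothetical helper: constant padding for (height, width, channels) pixel data.
	function padConstant(pixelData, [height, width, channels], padSize, { center = false, constantValue = 0 } = {}) {
	  const { width: paddedWidth, height: paddedHeight } = padSize;
	  if (paddedWidth < width || paddedHeight < height) {
	    throw new RangeError('Padded size must not be smaller than the image.');
	  }
	  const out = new Float32Array(paddedHeight * paddedWidth * channels).fill(constantValue);
	  // Without centering, the image stays in the top-left corner.
	  const top = center ? Math.floor((paddedHeight - height) / 2) : 0;
	  const left = center ? Math.floor((paddedWidth - width) / 2) : 0;
	  for (let y = 0; y < height; ++y) {
	    // Copy one source row into its offset position in the padded buffer.
	    const srcStart = y * width * channels;
	    const dstStart = ((y + top) * paddedWidth + left) * channels;
	    out.set(pixelData.subarray(srcStart, srcStart + width * channels), dstStart);
	  }
	  return [out, [paddedHeight, paddedWidth, channels]];
	}

	const [padded, paddedDims] = padConstant(
	  new Float32Array([1, 2, 3, 4]), // a 2x2 single-channel image
	  [2, 2, 1],
	  { width: 3, height: 3 },
	);
	console.log(Array.from(padded), paddedDims);
	// [ 1, 2, 0, 3, 4, 0, 0, 0, 0 ] [ 3, 3, 1 ]
	```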

	#### `ImageProcessor.rescale(pixelData)`

	Rescale the image pixel values by `this.rescale_factor`.

	Parameters

	- `pixelData` (`Float32Array`) — The pixel data to rescale.

	Returns: `void`
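
	The `void` return type suggests that the buffer is modified in place. A minimal sketch of that behavior follows; `rescaleInPlace` is a hypothetical standalone function, and the `1 / 255` default is an assumed, commonly used factor that maps 8-bit values into the [0, 1] range.

	```javascript
	// Hypothetical helper: multiply every value by the rescale factor, in place.
	function rescaleInPlace(pixelData, rescaleFactor = 1 / 255) {
	  for (let i = 0; i < pixelData.length; ++i) {
	    pixelData[i] *= rescaleFactor;
	  }
	  // Nothing is returned: the caller's buffer now holds the rescaled values.
	}

	const pixels = new Float32Array([0, 51, 255]);
	rescaleInPlace(pixels);
	console.log(pixels); // values close to 0, 0.2 and 1
	```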

	#### `ImageProcessor.get_resize_output_image_size(image, size)`

	Find the target (width, height) dimension of the output image after
	resizing given the input image and the desired size.

	Parameters

	- `image` ([`RawImage`](./utils/image#module_utils/image.RawImage)) — The image to resize.
	- `size` (`any`) — The size to use for resizing the image.

	Returns: [`number`, `number`] — The target (width, height) dimension of the output image after resizing.
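
	Because `size` is typed as `any`, the exact computation depends on the form found in the config. As one illustration, the sketch below handles a `{ shortest_edge }` style of size by scaling the shorter side to the target while keeping the aspect ratio. `shortestEdgeSize` is a hypothetical helper and the rounding choice is an assumption; note the (width, height) order of the result.

	```javascript
	// Hypothetical helper for one possible `size` form: a target for the shorter side.
	function shortestEdgeSize(image, shortestEdge) {
	  const scale = shortestEdge / Math.min(image.width, image.height);
	  // Returned as (width, height), matching the order documented above.
	  return [Math.round(image.width * scale), Math.round(image.height * scale)];
	}

	const [newWidth, newHeight] = shortestEdgeSize({ width: 640, height: 480 }, 240);
	console.log(newWidth, newHeight); // 320 240
	```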

	#### `ImageProcessor.resize(image)`

	Resizes the image.

	Parameters

	- `image` ([`RawImage`](./utils/image#module_utils/image.RawImage)) — The image to resize.

	Returns: `Promise`<[`RawImage`](./utils/image#module_utils/image.RawImage)> — The resized image.

	#### `ImageProcessor.preprocess(image, overrides)`

	Preprocesses the given image.

	Parameters

	- `image` ([`RawImage`](./utils/image#module_utils/image.RawImage)) — The image to preprocess.
	- `overrides` (`Object`) — The overrides for the preprocessing options.

	Returns: `Promise`<[`PreprocessedImage`](./processors#module_processors.PreprocessedImage)> — The preprocessed image.

	#### `ImageProcessor.from_pretrained(pretrained_model_name_or_path, options)`

	Instantiate one of the processor classes of the library from a pretrained model.

	The processor class to instantiate is selected based on the `image_processor_type` (or `feature_extractor_type`; legacy)
	property of the config object (either passed as an argument or loaded from `pretrained_model_name_or_path` if possible).

	Parameters

	- `pretrained_model_name_or_path` (`string`) — The name or path of the pretrained model. Can be either:
	  - A string, the model ID of a pretrained processor hosted inside a model repo on huggingface.co.
	    Valid model IDs can be located at the root level, like `bert-base-uncased`, or namespaced under a
	    user or organization name, like `dbmdz/bert-base-german-cased`.
	  - A path to a directory containing processor files, e.g., `./my_model_directory/`.
	- `options` ([`PretrainedOptions`](./utils/hub#module_utils/hub.PretrainedOptions)) — Additional options for loading the processor.

	Returns: `Promise`<[`ImageProcessor`](./processors#module_processors.ImageProcessor)> — A new image processor instance.

	### AutoFeatureExtractor

	Loads a feature extractor from a pretrained model ID or local path. The concrete class is
	selected from the `feature_extractor_type` in `preprocessor_config.json`.
	Most commonly used for audio models.

	```javascript
	import { AutoFeatureExtractor, load_audio } from '@huggingface/transformers';

	const extractor = await AutoFeatureExtractor.from_pretrained('onnx-community/whisper-tiny.en');
	const audio = await load_audio('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav', 16000);
	const { input_features } = await extractor(audio);
	```

	#### `AutoFeatureExtractor.from_pretrained(pretrained_model_name_or_path, options)`

	Instantiate one of the feature extractor classes of the library from a pretrained model.

	The feature extractor class to instantiate is selected based on the `feature_extractor_type` property of
	the config object (either passed as an argument or loaded from `pretrained_model_name_or_path` if possible).

	Parameters

	- `pretrained_model_name_or_path` (`string`) — The name or path of the pretrained model. Can be either:
	  - A string, the model ID of a pretrained feature extractor hosted inside a model repo on huggingface.co.
	    Valid model IDs can be located at the root level, like `bert-base-uncased`, or namespaced under a
	    user or organization name, like `dbmdz/bert-base-german-cased`.
	  - A path to a directory containing feature extractor files, e.g., `./my_model_directory/`.
	- `options` ([`PretrainedOptions`](./utils/hub#module_utils/hub.PretrainedOptions)) — Additional options for loading the feature_extractor.

	Returns: `Promise`<[`FeatureExtractor`](./processors#module_processors.FeatureExtractor)> — A new feature extractor instance.

	### AutoImageProcessor

	Loads an image processor from a pretrained model ID or local path. The concrete class is
	selected from the `image_processor_type` in `preprocessor_config.json`.

	```javascript
	import { AutoImageProcessor, load_image } from '@huggingface/transformers';

	const processor = await AutoImageProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
	const image = await load_image('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/artemis.jpeg');
	const { pixel_values } = await processor(image);
	```

	#### `AutoImageProcessor.from_pretrained(pretrained_model_name_or_path, options)`

	Instantiate one of the processor classes of the library from a pretrained model.

	The processor class to instantiate is selected based on the `image_processor_type` (or `feature_extractor_type`; legacy)
	property of the config object (either passed as an argument or loaded from `pretrained_model_name_or_path` if possible).

	Parameters

	- `pretrained_model_name_or_path` (`string`) — The name or path of the pretrained model. Can be either:
	  - A string, the model ID of a pretrained processor hosted inside a model repo on huggingface.co.
	    Valid model IDs can be located at the root level, like `bert-base-uncased`, or namespaced under a
	    user or organization name, like `dbmdz/bert-base-german-cased`.
	  - A path to a directory containing processor files, e.g., `./my_model_directory/`.
	- `options` ([`PretrainedOptions`](./utils/hub#module_utils/hub.PretrainedOptions)) — Additional options for loading the processor.

	Returns: `Promise`<[`ImageProcessor`](./processors#module_processors.ImageProcessor)> — A new image processor instance.

	### AutoProcessor

	Loads a processor from a pretrained model ID or local path. Unlike `AutoImageProcessor` and
	`AutoFeatureExtractor`, `AutoProcessor` returns a multi-modal [`Processor`](./processors#module_processors.Processor)
	that bundles together a tokenizer, image processor, and/or feature extractor
	— use it when a single model needs more than one.

	Example: Load a Whisper processor (tokenizer + audio feature extractor).
	```javascript
	import { AutoProcessor } from '@huggingface/transformers';
	const processor = await AutoProcessor.from_pretrained('onnx-community/whisper-tiny.en');
	```

	Example: Run an image through a CLIP processor.
	```javascript
	import { AutoProcessor, load_image } from '@huggingface/transformers';

	const processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
	const image = await load_image('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
	const { pixel_values } = await processor(image);
	```

	#### `AutoProcessor.from_pretrained(pretrained_model_name_or_path, options)`

	Instantiate one of the processor classes of the library from a pretrained model.

	The processor class to instantiate is selected based on the `image_processor_type` (or `feature_extractor_type`; legacy)
	property of the config object (either passed as an argument or loaded from `pretrained_model_name_or_path` if possible).

	Parameters

	- `pretrained_model_name_or_path` (`string`) — The name or path of the pretrained model. Can be either:
	  - A string, the model ID of a pretrained processor hosted inside a model repo on huggingface.co.
	    Valid model IDs can be located at the root level, like `bert-base-uncased`, or namespaced under a
	    user or organization name, like `dbmdz/bert-base-german-cased`.
	  - A path to a directory containing processor files, e.g., `./my_model_directory/`.
	- `options` ([`PretrainedProcessorOptions`](./processors#module_processors.PretrainedProcessorOptions)) — Additional options for loading the processor.

	Returns: `Promise`<[`Processor`](./processors#module_processors.Processor)> — A new processor instance.

	### Processor

	Multi-modal preprocessor that delegates to the tokenizer, image processor,
	and/or feature extractor required by a model.

	#### `Processor(input, args)`

	Calls the feature_extractor function with the given input.

	Parameters

	- `input` (`any`) — The input to extract features from.
	- `args` (`...any`) — Additional arguments.

	Returns: `Promise`<`any`> — A Promise that resolves with the extracted features.

	#### `Processor.constructor(config, components, chat_template)`

	Create a processor from parsed config and its component preprocessors.

	Parameters

	- `config` (`Object`) — Processor configuration.
	- `components` (`Record`<`string`, `Object`>) — Loaded tokenizer, image processor, and/or feature extractor.
	- `chat_template` (`string` \| `null`) — Optional chat template loaded from the model repo.

	#### `Processor.apply_chat_template(messages, options)`

	Delegates to the underlying tokenizer's `apply_chat_template`.

	Parameters

	- `messages` ([`Message`](./tokenizers#module_tokenizers.Message)[])
	- `options` ([`ApplyChatTemplateOptions`](./tokenizers#module_tokenizers.ApplyChatTemplateOptions)<`TTokenize`, `TReturnTensor`, `TReturnDict`>)

	Returns: `ApplyChatTemplateReturn`<`TTokenize`, `TReturnTensor`, `TReturnDict`>

	#### `Processor.batch_decode(batch, decode_args)`

	Decode a batch of tokenized sequences via the underlying tokenizer.

	Parameters

	- `batch` (`number[][]` \| [`Tensor`](./utils/tensor#module_utils/tensor.Tensor)) — List/Tensor of tokenized input sequences.
	- `decode_args` (`Object`) — (Optional) Object with decoding arguments.

	Returns: `string[]`

	#### `Processor.decode(token_ids, [decode_args])`

	Decode a single tokenized sequence via the underlying tokenizer.

	Parameters

	- `token_ids` (`number[]` \| `bigint[]` \| [`Tensor`](./utils/tensor#module_utils/tensor.Tensor)) — List/Tensor of token IDs to decode.
	- `decode_args` (`Object`) _optional_ — defaults to `{}`
	  - `skip_special_tokens` (`boolean`) _optional_ — defaults to `false` — If true, special tokens are removed from the output string.
	  - `clean_up_tokenization_spaces` (`boolean`) _optional_ — defaults to `true` — If true, spaces before punctuation and abbreviated forms are removed.

	Returns: `string`
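
	To show what "spaces before punctuation and abbreviated forms are removed" can look like in practice, here is a rough standalone sketch. `cleanUpSpaces` is a hypothetical helper with an assumed, non-exhaustive set of patterns; it only approximates what a tokenizer's clean-up pass might do.

	```javascript
	// Hypothetical helper approximating a "clean up tokenization spaces" pass.
	function cleanUpSpaces(text) {
	  return text
	    .replace(/ ([.,!?])/g, '$1')             // drop the space before punctuation
	    .replace(/ (n't|'s|'m|'ve|'re)/g, '$1'); // rejoin abbreviated forms
	}

	console.log(cleanUpSpaces("Hello , world ! I do n't know ."));
	// Hello, world! I don't know.
	```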

	#### `Processor.from_pretrained(pretrained_model_name_or_path, options)`

	Instantiate one of the processor classes of the library from a pretrained model.

	The processor class to instantiate is selected based on the `image_processor_type` (or `feature_extractor_type`; legacy)
	property of the config object (either passed as an argument or loaded from `pretrained_model_name_or_path` if possible).

	Parameters

	- `pretrained_model_name_or_path` (`string`) — The name or path of the pretrained model. Can be either:
	  - A string, the model ID of a pretrained processor hosted inside a model repo on huggingface.co.
	    Valid model IDs can be located at the root level, like `bert-base-uncased`, or namespaced under a
	    user or organization name, like `dbmdz/bert-base-german-cased`.
	  - A path to a directory containing processor files, e.g., `./my_model_directory/`.
	- `options` ([`PretrainedProcessorOptions`](./processors#module_processors.PretrainedProcessorOptions)) — Additional options for loading the processor.

	Returns: `Promise`<[`Processor`](./processors#module_processors.Processor)> — A new processor instance.

	## Type Definitions

	### HeightWidth

	Named tuple indicating that sizes are ordered (height x width),
	even though the graphics industry standard is (width x height).

	_Type:_ [`height: number`, `width: number`]
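
	The ordering is easy to get wrong when an image object exposes `width` and `height` separately. A tiny sketch of the conversion follows; `toHeightWidth` is a hypothetical helper and the plain object stands in for an image.

	```javascript
	// Hypothetical helper: build a [height, width] tuple from an image-like object.
	function toHeightWidth(image) {
	  return [image.height, image.width]; // height first, then width
	}

	const [h, w] = toHeightWidth({ width: 640, height: 480 });
	console.log(h, w); // 480 640
	```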

	### ImageProcessorResult

	Properties

	- `pixel_values` ([`Tensor`](./utils/tensor#module_utils/tensor.Tensor)) — The pixel values of the batched preprocessed images.
	- `original_sizes` ([`HeightWidth`](./processors#module_processors.HeightWidth)[]) — Array of two-dimensional tuples like [[480, 640]].
	- `reshaped_input_sizes` ([`HeightWidth`](./processors#module_processors.HeightWidth)[]) — Array of two-dimensional tuples like [[1000, 1330]].

	### ImageProcessorConfig

	A configuration object used to create an image processor.

	Properties

	- `progress_callback` (`function`) _optional_ — defaults to `null` — If specified, this function is called during model construction with progress updates.
	- `image_mean` (`number[]`) _optional_ — The mean values for image normalization.
	- `image_std` (`number[]`) _optional_ — The standard deviation values for image normalization.
	- `do_rescale` (`boolean`) _optional_ — Whether to rescale the image pixel values to the [0,1] range.
	- `rescale_factor` (`number`) _optional_ — The factor to use for rescaling the image pixel values.
	- `do_normalize` (`boolean`) _optional_ — Whether to normalize the image pixel values.
	- `do_resize` (`boolean`) _optional_ — Whether to resize the image.
	- `resample` (`number`) _optional_ — What method to use for resampling.
	- `size` (`number` \| `Object`) _optional_ — The size to resize the image to.
	- `image_size` (`number` \| `Object`) _optional_ — The size to resize the image to (same as `size`).
	- `do_flip_channel_order` (`boolean`) _optional_ — defaults to `false` — Whether to flip the color channels from RGB to BGR.
	Can be overridden by the `do_flip_channel_order` parameter in the `preprocess` method.
	- `do_center_crop` (`boolean`) _optional_ — Whether to center crop the image to the specified `crop_size`.
	Can be overridden by `do_center_crop` in the `preprocess` method.
	- `do_thumbnail` (`boolean`) _optional_ — Whether to resize the image using thumbnail method.
	- `keep_aspect_ratio` (`boolean`) _optional_ — If `true`, the image is resized to the largest possible size such that the aspect ratio is preserved.
	Can be overridden by `keep_aspect_ratio` in `preprocess`.
	- `ensure_multiple_of` (`number`) _optional_ — If `do_resize` is `true`, the image is resized to a size that is a multiple of this value.
	Can be overridden by `ensure_multiple_of` in `preprocess`.
	- `mean` (`number[]`) _optional_ — The mean values for image normalization (same as `image_mean`).
	- `std` (`number[]`) _optional_ — The standard deviation values for image normalization (same as `image_std`).
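
	To illustrate how `rescale_factor`, `image_mean`, and `image_std` combine, here is a minimal sketch of rescaling followed by per-channel normalization. `rescaleAndNormalize` is a hypothetical helper that assumes channels-last pixel data; the actual order and layout used by a given processor may differ.

	```javascript
	// Hypothetical helper: rescale, then normalize channels-last (H, W, C) data in place.
	function rescaleAndNormalize(pixelData, channels, { rescale_factor = 1 / 255, image_mean, image_std }) {
	  for (let i = 0; i < pixelData.length; ++i) {
	    const c = i % channels; // channel index in a channels-last layout
	    pixelData[i] = (pixelData[i] * rescale_factor - image_mean[c]) / image_std[c];
	  }
	  return pixelData;
	}

	const normalized = rescaleAndNormalize(
	  new Float32Array([255, 0, 127.5]), // one RGB pixel with 8-bit style values
	  3,
	  { image_mean: [0.5, 0.5, 0.5], image_std: [0.5, 0.5, 0.5] },
	);
	console.log(normalized); // values close to 1, -1 and 0
	```

	With a mean and standard deviation of 0.5, values in [0, 1] end up in [-1, 1].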

	### PreprocessedImage

	Properties

	- `original_size` ([`HeightWidth`](./processors#module_processors.HeightWidth)) — The original size of the image.
	- `reshaped_input_size` ([`HeightWidth`](./processors#module_processors.HeightWidth)) — The reshaped input size of the image.
	- `pixel_values` ([`Tensor`](./utils/tensor#module_utils/tensor.Tensor)) — The pixel values of the preprocessed image.

	### ProcessorProperties

	Additional processor-specific properties.

	### PretrainedProcessorOptions

	_Type:_ [`PretrainedOptions`](./utils/hub#module_utils/hub.PretrainedOptions) & [`ProcessorProperties`](./processors#module_processors.ProcessorProperties)
