processors

Processors turn raw inputs (images, audio, text) into the tensor shapes a model expects. Pipelines pick the right processor automatically; call one directly only when you need to preprocess without running inference.

Three Auto* entry points cover the common cases:

  • AutoProcessor — multi-modal (tokenizer + image/audio), e.g. Whisper, CLIP.
  • AutoImageProcessor — vision-only models.
  • AutoFeatureExtractor — audio-only models.

Example: Prepare audio for Whisper.

import { AutoProcessor, load_audio } from '@huggingface/transformers';

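// Load the processor (tokenizer + audio feature extractor) for this checkpoint.
const processor = await AutoProcessor.from_pretrained('onnx-community/whisper-tiny.en');
// Fetch the clip and resample it to 16 kHz, the rate Whisper expects.
const audio = await load_audio('https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac', 16000);
// Convert the waveform into the log-mel features the model consumes.
const { input_features } = await processor(audio);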
// Tensor {
//   data: Float32Array(240000) [0.4752984642982483, 0.5597258806228638, 0.56434166431427, ...],
//   dims: [1, 80, 3000],
//   type: 'float32',
//   size: 240000,
// }

Classes

FeatureExtractor

Base class for audio feature extractors.

FeatureExtractor.constructor(config)

Create a feature extractor from a parsed preprocessor_config.json.

Parameters

  • config (Object) — The configuration for the feature extractor.

FeatureExtractor.from_pretrained(pretrained_model_name_or_path, options)

Instantiate one of the feature extractor classes of the library from a pretrained model.

The feature extractor class to instantiate is selected based on the feature_extractor_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible).

Parameters

  • pretrained_model_name_or_path (string) — The name or path of the pretrained model. Can be either:
    • A string, the model ID of a pretrained feature extractor hosted inside a model repo on huggingface.co. Valid model IDs can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing feature extractor files, e.g., ./my_model_directory/.
  • options (PretrainedOptions) — Additional options for loading the feature extractor.

Returns: Promise<FeatureExtractor> — A new feature extractor instance.
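
The selection rule described above can be pictured as a lookup from the config's feature_extractor_type string to a class. The following is a minimal stand-alone sketch, not the library's code: the REGISTRY map, the toy classes, and the fallback to the base class are all assumptions made for illustration.

```javascript
// Hypothetical stand-ins for concrete extractor classes (not library classes).
class BaseExtractor {
  constructor(config) {
    this.config = config;
  }
}
class ToyWhisperExtractor extends BaseExtractor {}
class ToyWav2Vec2Extractor extends BaseExtractor {}

// Hypothetical registry keyed by the feature_extractor_type string.
const REGISTRY = {
  WhisperFeatureExtractor: ToyWhisperExtractor,
  Wav2Vec2FeatureExtractor: ToyWav2Vec2Extractor,
};

// Pick a class from an already-parsed preprocessor_config.json object.
// Falling back to the base class is an assumption of this sketch.
function selectExtractor(config) {
  const Cls = REGISTRY[config.feature_extractor_type] ?? BaseExtractor;
  return new Cls(config);
}

const extractor = selectExtractor({
  feature_extractor_type: 'WhisperFeatureExtractor',
  sampling_rate: 16000,
});
console.log(extractor.constructor.name); // ToyWhisperExtractor
```

Keeping the type name in the config file is what lets a caller load an extractor without knowing its concrete class in advance.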

ImageProcessor

Base class for image processors.

ImageProcessor(images, args)

Preprocess one or more images and batch the result into pixel_values.

Parameters

  • images (RawImage[]?) — The image or images to preprocess.
  • args (...any) — Additional arguments.

Returns: Promise<ImageProcessorResult> — An object containing the concatenated pixel values (and other metadata) of the preprocessed images.
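
To make "batch the result into pixel_values" concrete, here is a small sketch of stacking per-image arrays into one contiguous buffer. It is an illustration only: stackPixelValues is a hypothetical helper, and the [batch, channels, height, width] layout is assumed rather than taken from this page.

```javascript
// Hypothetical helper: stack equally-sized per-image arrays into one batch.
// `dims` describes a single image, e.g. [channels, height, width].
function stackPixelValues(images, dims) {
  const perImage = dims.reduce((a, b) => a * b, 1);
  const data = new Float32Array(images.length * perImage);
  // Each image occupies one contiguous slice of the batch buffer.
  images.forEach((img, i) => data.set(img, i * perImage));
  return { data, dims: [images.length, ...dims] };
}

const batch = stackPixelValues(
  [new Float32Array([1, 2, 3, 4]), new Float32Array([5, 6, 7, 8])],
  [1, 2, 2],
);
console.log(batch.dims); // [ 2, 1, 2, 2 ]
```

Stacking this way only works when every image has already been resized or padded to the same shape, which is why batching happens after per-image preprocessing.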

ImageProcessor.constructor(config)

Create an image processor from a parsed preprocessor_config.json.

Parameters

  • config (ImageProcessorConfig) — The configuration for the image processor.

ImageProcessor.thumbnail(image, size, [resample])

Resize the image to make a thumbnail. The image is resized so that no dimension is larger than any corresponding dimension of the specified size.

Parameters

  • image (RawImage) — The image to be resized.
  • size ({height:number, width:number}) — The size {"height": h, "width": w} to resize the image to.
  • resample (string | 0 | 1 | 2 | 3 | 4 | 5) optional — defaults to 2 — The resampling filter to use.

Returns: Promise<RawImage> — The resized image.
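
The size constraint described for thumbnail can be sketched without touching any pixels. The helper below is a hypothetical illustration of one way to satisfy it (never upscale, keep the aspect ratio, stay within both bounds); the library's own rounding and edge-case handling may differ.

```javascript
// Hypothetical helper: compute a thumbnail's [height, width] from the input size.
function thumbnailSize(inputHeight, inputWidth, size) {
  // A scale of at most 1 means the image is never enlarged.
  const scale = Math.min(1, size.height / inputHeight, size.width / inputWidth);
  return [
    Math.max(1, Math.floor(inputHeight * scale)),
    Math.max(1, Math.floor(inputWidth * scale)),
  ];
}

console.log(thumbnailSize(1024, 512, { height: 256, width: 256 })); // [ 256, 128 ]
console.log(thumbnailSize(100, 150, { height: 200, width: 200 }));  // [ 100, 150 ]
```

The second call shows that an image already inside the bounds keeps its original size.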

ImageProcessor.crop_margin(image, gray_threshold)

Crops the margin of the image. Gray pixels are considered margin (i.e., pixels with a value below the threshold).

Parameters

  • image (RawImage) — The image to be cropped.
  • gray_threshold (number) — Value below which pixels are considered to be gray.

Returns: Promise<RawImage> — The cropped image.

ImageProcessor.pad_image(pixelData, imgDims, padSize, options)

Pad the image by a certain amount.

Parameters

  • pixelData (Float32Array) — The pixel data to pad.
  • imgDims (number[]) — The dimensions of the image (height, width, channels).
  • padSize ({width:number; height:number} | number | 'square') — The dimensions of the padded image.
  • options (Object) — The options for padding.
    • mode ('constant' | 'symmetric') optional — defaults to 'constant' — The type of padding to add.
    • center (boolean) optional — defaults to false — Whether to center the image.
    • constant_values (number[]?) optional — defaults to 0 — The constant value to use for padding.

Returns: [Float32Array, number[]] — The padded pixel data and image dimensions.
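
As a rough picture of 'constant' padding, the sketch below pads height-width-channel pixel data onto a larger canvas. padConstant is a hypothetical helper written for this example: it assumes the target size is at least as large as the image, takes a single scalar fill value, and does not implement 'symmetric' mode or the 'square' shorthand.

```javascript
// Hypothetical helper: constant-pad HWC pixel data to a target height/width.
function padConstant(pixelData, imgDims, padSize, { center = false, constantValue = 0 } = {}) {
  const [height, width, channels] = imgDims;
  const { height: outHeight, width: outWidth } = padSize;

  // Start from a buffer filled with the padding value.
  const padded = new Float32Array(outHeight * outWidth * channels).fill(constantValue);

  // Offset of the original image inside the padded canvas.
  const top = center ? Math.floor((outHeight - height) / 2) : 0;
  const left = center ? Math.floor((outWidth - width) / 2) : 0;

  // Copy one row at a time; each row is width * channels contiguous floats.
  for (let y = 0; y < height; ++y) {
    const src = y * width * channels;
    const dst = ((y + top) * outWidth + left) * channels;
    padded.set(pixelData.subarray(src, src + width * channels), dst);
  }
  return [padded, [outHeight, outWidth, channels]];
}

// A 1x2 single-channel image padded to 2x3, aligned to the top-left corner.
const [data, dims] = padConstant(new Float32Array([1, 2]), [1, 2, 1], { height: 2, width: 3 });
console.log(Array.from(data), dims); // [ 1, 2, 0, 0, 0, 0 ] [ 2, 3, 1 ]
```

With center set to true the same image would sit in the middle of the canvas, with the padding split between both sides.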

ImageProcessor.rescale(pixelData)

Rescale the image pixel values by this.rescale_factor.

Parameters

  • pixelData (Float32Array) — The pixel data to rescale.

Returns: void
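
Because rescale returns void, the pixel buffer is modified in place. A minimal stand-alone sketch of that behavior follows; rescaleInPlace is a hypothetical function, and the default factor of 1/255 is an assumption chosen because it maps 8-bit values into the [0, 1] range.

```javascript
// Hypothetical stand-alone version of an in-place rescale.
function rescaleInPlace(pixelData, rescaleFactor = 1 / 255) {
  for (let i = 0; i < pixelData.length; ++i) {
    pixelData[i] *= rescaleFactor;
  }
  // Nothing is returned: the caller's buffer now holds the rescaled values.
}

const pixels = new Float32Array([0, 51, 255]);
rescaleInPlace(pixels);
console.log(Array.from(pixels)); // approximately [0, 0.2, 1]
```

Mutating the buffer avoids allocating a second array for every image, at the cost of losing the original values.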

ImageProcessor.get_resize_output_image_size(image, size)

Find the target (width, height) dimension of the output image after resizing given the input image and the desired size.

Parameters

  • image (RawImage) — The image to resize.
  • size (any) — The size to use for resizing the image.

Returns: [number, number] — The target (width, height) dimension of the output image after resizing.
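
Since size is typed as any, the exact resolution rules depend on the config. As one illustrative case, the sketch below assumes the size is given as a single shortest-edge length and scales the image so that its shorter side matches it. resizeToShortestEdge is a hypothetical helper, and the rounding choice is an assumption.

```javascript
// Hypothetical helper: scale so the shorter side equals shortestEdge.
function resizeToShortestEdge(srcWidth, srcHeight, shortestEdge) {
  const scale = shortestEdge / Math.min(srcWidth, srcHeight);
  // Return order is [width, height], matching the method described above.
  return [Math.round(srcWidth * scale), Math.round(srcHeight * scale)];
}

console.log(resizeToShortestEdge(640, 480, 240)); // [ 320, 240 ]
```

Note the (width, height) order here, which is the reverse of the HeightWidth tuples used elsewhere on this page.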

ImageProcessor.resize(image)

Resizes the image.

Parameters

  • image (RawImage) — The image to resize.

Returns: Promise<RawImage> — The resized image.

ImageProcessor.preprocess(image, overrides)

Preprocesses the given image.

Parameters

  • image (RawImage) — The image to preprocess.
  • overrides (Object) — The overrides for the preprocessing options.

Returns: Promise<PreprocessedImage> — The preprocessed image.

ImageProcessor.from_pretrained(pretrained_model_name_or_path, options)

Instantiate one of the processor classes of the library from a pretrained model.

The processor class to instantiate is selected based on the image_processor_type (or feature_extractor_type; legacy) property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible).

Parameters

  • pretrained_model_name_or_path (string) — The name or path of the pretrained model. Can be either:
    • A string, the model ID of a pretrained processor hosted inside a model repo on huggingface.co. Valid model IDs can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing processor files, e.g., ./my_model_directory/.
  • options (PretrainedOptions) — Additional options for loading the processor.

Returns: Promise<ImageProcessor> — A new image processor instance.

AutoFeatureExtractor

Loads a feature extractor from a pretrained id. The concrete class is selected from the feature_extractor_type in preprocessor_config.json. Most commonly used for audio models.

import { AutoFeatureExtractor, load_audio } from '@huggingface/transformers';

const extractor = await AutoFeatureExtractor.from_pretrained('onnx-community/whisper-tiny.en');
const audio = await load_audio('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav', 16000);
const { input_features } = await extractor(audio);

AutoFeatureExtractor.from_pretrained(pretrained_model_name_or_path, options)

Instantiate one of the feature extractor classes of the library from a pretrained model.

The feature extractor class to instantiate is selected based on the feature_extractor_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible).

Parameters

  • pretrained_model_name_or_path (string) — The name or path of the pretrained model. Can be either:
    • A string, the model ID of a pretrained feature extractor hosted inside a model repo on huggingface.co. Valid model IDs can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing feature extractor files, e.g., ./my_model_directory/.
  • options (PretrainedOptions) — Additional options for loading the feature extractor.

Returns: Promise<FeatureExtractor> — A new feature extractor instance.

AutoImageProcessor

Loads an image processor from a pretrained id. The concrete class is selected from the image_processor_type in preprocessor_config.json.

import { AutoImageProcessor, load_image } from '@huggingface/transformers';

const processor = await AutoImageProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
const image = await load_image('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/artemis.jpeg');
const { pixel_values } = await processor(image);

AutoImageProcessor.from_pretrained(pretrained_model_name_or_path, options)

Instantiate one of the processor classes of the library from a pretrained model.

The processor class to instantiate is selected based on the image_processor_type (or feature_extractor_type; legacy) property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible).

Parameters

  • pretrained_model_name_or_path (string) — The name or path of the pretrained model. Can be either:
    • A string, the model ID of a pretrained processor hosted inside a model repo on huggingface.co. Valid model IDs can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing processor files, e.g., ./my_model_directory/.
  • options (PretrainedOptions) — Additional options for loading the processor.

Returns: Promise<ImageProcessor> — A new image processor instance.

AutoProcessor

Loads a processor from a pretrained id. Unlike AutoImageProcessor and AutoFeatureExtractor, AutoProcessor returns a multi-modal Processor that bundles together a tokenizer, image processor, and/or feature extractor — use it when a single model needs more than one.

Example: Load a Whisper processor (tokenizer + audio feature extractor).

import { AutoProcessor } from '@huggingface/transformers';
const processor = await AutoProcessor.from_pretrained('onnx-community/whisper-tiny.en');

Example: Run an image through a CLIP processor.

import { AutoProcessor, load_image } from '@huggingface/transformers';

const processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
const image = await load_image('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
const { pixel_values } = await processor(image);

AutoProcessor.from_pretrained(pretrained_model_name_or_path, options)

Instantiate one of the processor classes of the library from a pretrained model.

The processor class to instantiate is selected based on the image_processor_type (or feature_extractor_type; legacy) property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible).

Parameters

  • pretrained_model_name_or_path (string) — The name or path of the pretrained model. Can be either:
    • A string, the model ID of a pretrained processor hosted inside a model repo on huggingface.co. Valid model IDs can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing processor files, e.g., ./my_model_directory/.
  • options (PretrainedProcessorOptions) — Additional options for loading the processor.

Returns: Promise<Processor> — A new processor instance.

Processor

Multi-modal preprocessor that delegates to the tokenizer, image processor, and/or feature extractor required by a model.

Processor(input, args)

Calls the feature_extractor function with the given input.

Parameters

  • input (any) — The input to extract features from.
  • args (...any) — Additional arguments.

Returns: Promise<any> — A Promise that resolves with the extracted features.

Processor.constructor(config, components, chat_template)

Create a processor from parsed config and its component preprocessors.

Parameters

  • config (Object) — Processor configuration.
  • components (Record<string, Object>) — Loaded tokenizer, image processor, and/or feature extractor.
  • chat_template (string | null) — Optional chat template loaded from the model repo.

Processor.apply_chat_template(messages, options)

Delegates to the underlying tokenizer's apply_chat_template.

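Parameters

  • messages (Message[]) — The chat messages to format, passed through to the tokenizer.
  • options (Object) — Options forwarded to the tokenizer's apply_chat_template.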

Returns: ApplyChatTemplateReturn<TTokenize, TReturnTensor, TReturnDict>

Processor.batch_decode(batch, decode_args)

Decode a batch of tokenized sequences via the underlying tokenizer.

Parameters

  • batch (number[][] | Tensor) — List/Tensor of tokenized input sequences.
  • decode_args (Object) — (Optional) Object with decoding arguments.

Returns: string[]

Processor.decode(token_ids, [decode_args])

Decode a single tokenized sequence via the underlying tokenizer.

Parameters

  • token_ids (number[] | bigint[] | Tensor) — List/Tensor of token IDs to decode.
  • decode_args (Object) optional — defaults to {}
    • skip_special_tokens (boolean) optional — defaults to false — If true, special tokens are removed from the output string.
    • clean_up_tokenization_spaces (boolean) optional — defaults to true — If true, spaces before punctuation and abbreviated forms are removed.

Returns: string
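
The two decode options can be illustrated with a toy decoder. Everything below is hypothetical: the five-entry vocabulary, the special-token ids, and the punctuation rule are simplified stand-ins for what a real tokenizer does.

```javascript
// Toy vocabulary and special-token ids (assumptions for this example only).
const VOCAB = ['<s>', '</s>', 'Hello', 'world', '!'];
const SPECIAL_IDS = new Set([0, 1]);

function toyDecode(tokenIds, { skip_special_tokens = false, clean_up_tokenization_spaces = true } = {}) {
  // Accept number[] or bigint[] by converting every id to a number.
  let ids = Array.from(tokenIds, Number);
  if (skip_special_tokens) {
    ids = ids.filter((id) => !SPECIAL_IDS.has(id));
  }
  let text = ids.map((id) => VOCAB[id]).join(' ');
  if (clean_up_tokenization_spaces) {
    // Simplified cleanup: drop the space before common punctuation marks.
    text = text.replace(/ ([.,!?])/g, '$1');
  }
  return text;
}

console.log(toyDecode([0n, 2n, 3n, 4n, 1n], { skip_special_tokens: true })); // Hello world!
console.log(toyDecode([0, 2, 3, 4, 1])); // <s> Hello world! </s>
```

With the defaults, special tokens stay in the output, which is useful for debugging but usually not what is shown to an end user.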

Processor.from_pretrained(pretrained_model_name_or_path, options)

Instantiate one of the processor classes of the library from a pretrained model.

The processor class to instantiate is selected based on the image_processor_type (or feature_extractor_type; legacy) property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible).

Parameters

  • pretrained_model_name_or_path (string) — The name or path of the pretrained model. Can be either:
    • A string, the model ID of a pretrained processor hosted inside a model repo on huggingface.co. Valid model IDs can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing processor files, e.g., ./my_model_directory/.
  • options (PretrainedProcessorOptions) — Additional options for loading the processor.

Returns: Promise<Processor> — A new processor instance.

Type Definitions

HeightWidth

Named tuple indicating that the order used is (height x width), even though the graphics industry standard is (width x height).

Type: [height: number, width: number]

ImageProcessorResult

Properties

  • pixel_values (Tensor) — The pixel values of the batched preprocessed images.
  • original_sizes (HeightWidth[]) — Array of two-dimensional tuples like [[480, 640]].
  • reshaped_input_sizes (HeightWidth[]) — Array of two-dimensional tuples like [[1000, 1330]].
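
One common use of original_sizes and reshaped_input_sizes is mapping model outputs, such as points or boxes, back onto the original image. The sketch below assumes the image was only resized (no padding or cropping); toOriginalCoords is a hypothetical helper, and the sizes reuse the example values from the property list above.

```javascript
// Hypothetical helper: map a point from model-input space back to the original image.
function toOriginalCoords([x, y], originalSize, reshapedSize) {
  const [origHeight, origWidth] = originalSize; // HeightWidth order
  const [newHeight, newWidth] = reshapedSize;   // HeightWidth order
  return [(x * origWidth) / newWidth, (y * origHeight) / newHeight];
}

const original_sizes = [[480, 640]];
const reshaped_input_sizes = [[1000, 1330]];

// The center of the resized image maps to the center of the original.
console.log(toOriginalCoords([665, 500], original_sizes[0], reshaped_input_sizes[0])); // [ 320, 240 ]
```

Both arrays hold one entry per image in the batch, so index 0 here refers to the first image.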

ImageProcessorConfig

A configuration object used to create an image processor.

Properties

  • progress_callback (function) optional — defaults to null — If specified, this function is called during model construction with progress updates.
  • image_mean (number[]) optional — The mean values for image normalization.
  • image_std (number[]) optional — The standard deviation values for image normalization.
  • do_rescale (boolean) optional — Whether to rescale the image pixel values to the [0,1] range.
  • rescale_factor (number) optional — The factor to use for rescaling the image pixel values.
  • do_normalize (boolean) optional — Whether to normalize the image pixel values.
  • do_resize (boolean) optional — Whether to resize the image.
  • resample (number) optional — What method to use for resampling.
  • size (number | Object) optional — The size to resize the image to.
  • image_size (number | Object) optional — The size to resize the image to (same as size).
  • do_flip_channel_order (boolean) optional — defaults to false — Whether to flip the color channels from RGB to BGR. Can be overridden by the do_flip_channel_order parameter in the preprocess method.
  • do_center_crop (boolean) optional — Whether to center crop the image to the specified crop_size. Can be overridden by do_center_crop in the preprocess method.
  • do_thumbnail (boolean) optional — Whether to resize the image using thumbnail method.
  • keep_aspect_ratio (boolean) optional — If true, the image is resized to the largest possible size such that the aspect ratio is preserved. Can be overridden by keep_aspect_ratio in preprocess.
  • ensure_multiple_of (number) optional — If do_resize is true, the image is resized to a size that is a multiple of this value. Can be overridden by ensure_multiple_of in preprocess.
  • mean (number[]) optional — The mean values for image normalization (same as image_mean).
  • std (number[]) optional — The standard deviation values for image normalization (same as image_std).
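
To show how do_rescale, rescale_factor, do_normalize, image_mean, and image_std fit together, here is a small sketch applied to height-width-channel pixel data. rescaleAndNormalize is a hypothetical helper; the order of operations (rescale first, then normalize) and the HWC layout are assumptions of this example.

```javascript
// Hypothetical helper: apply do_rescale and do_normalize to HWC pixel data.
function rescaleAndNormalize(pixelData, channels, config) {
  const out = Float32Array.from(pixelData); // leave the input untouched
  for (let i = 0; i < out.length; ++i) {
    const c = i % channels; // channel index in HWC layout
    if (config.do_rescale) {
      out[i] *= config.rescale_factor;
    }
    if (config.do_normalize) {
      out[i] = (out[i] - config.image_mean[c]) / config.image_std[c];
    }
  }
  return out;
}

const config = {
  do_rescale: true,
  rescale_factor: 1 / 255,
  do_normalize: true,
  image_mean: [0.5, 0.5, 0.5],
  image_std: [0.5, 0.5, 0.5],
};

// One RGB pixel whose channels are 0, 127.5, and 255.
const normalized = rescaleAndNormalize(new Float32Array([0, 127.5, 255]), 3, config);
console.log(Array.from(normalized)); // approximately [-1, 0, 1]
```

With a mean and standard deviation of 0.5, values in [0, 1] end up in [-1, 1], a range many vision models are trained on.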

PreprocessedImage

Properties

  • original_size (HeightWidth) — The original size of the image.
  • reshaped_input_size (HeightWidth) — The reshaped input size of the image.
  • pixel_values (Tensor) — The pixel values of the preprocessed image.

ProcessorProperties

Additional processor-specific properties.

PretrainedProcessorOptions

Type: PretrainedOptions & ProcessorProperties
