processors
Processors turn raw inputs (images, audio, text) into the tensor shapes a model expects. Pipelines pick the right processor automatically; call one directly only when you need to preprocess without running inference.
Three Auto* entry points cover the common cases:
- AutoProcessor — multi-modal (tokenizer + image/audio), e.g. Whisper, CLIP.
- AutoImageProcessor — vision-only models.
- AutoFeatureExtractor — audio-only models.
Example: Prepare audio for Whisper.
import { AutoProcessor, load_audio } from '@huggingface/transformers';
const processor = await AutoProcessor.from_pretrained('onnx-community/whisper-tiny.en');
const audio = await load_audio('https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac', 16000);
const { input_features } = await processor(audio);
// Tensor {
// data: Float32Array(240000) [0.4752984642982483, 0.5597258806228638, 0.56434166431427, ...],
// dims: [1, 80, 3000],
// type: 'float32',
// size: 240000,
// }
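As a rough sanity check on the shape printed above, the dims can be reproduced by hand. This sketch assumes Whisper's usual preprocessing settings (a 30-second window, 16 kHz audio, a hop length of 160 samples, and 80 mel bins); none of these constants are stated on this page, and other checkpoints may use different values.

```javascript
// Assumed Whisper preprocessing constants (not stated on this page).
const SAMPLING_RATE = 16000; // samples per second, as passed to load_audio above
const CHUNK_SECONDS = 30; // assumed fixed window length
const HOP_LENGTH = 160; // assumed samples between consecutive frames
const NUM_MEL_BINS = 80; // assumed number of mel filter banks

// One frame per hop over the whole window.
const numFrames = (CHUNK_SECONDS * SAMPLING_RATE) / HOP_LENGTH;
const expectedDims = [1, NUM_MEL_BINS, numFrames];
const expectedSize = expectedDims.reduce((a, b) => a * b, 1);

console.log(expectedDims, expectedSize); // [ 1, 80, 3000 ] 240000
```

The leading 1 is the batch dimension, and size is simply the product of dims.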
Classes
FeatureExtractor
Base class for audio feature extractors.
FeatureExtractor.constructor(config)
Create a feature extractor from a parsed preprocessor_config.json.
Parameters
config(Object) — The configuration for the feature extractor.
FeatureExtractor.from_pretrained(pretrained_model_name_or_path, options)
Instantiate one of the feature extractor classes of the library from a pretrained model.
The feature extractor class to instantiate is selected based on the feature_extractor_type property of
the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible)
Parameters
pretrained_model_name_or_path(string) — The name or path of the pretrained model. Can be either:
- A string, the model ID of a pretrained feature extractor hosted inside a model repo on huggingface.co. Valid model IDs can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing feature extractor files, e.g., ./my_model_directory/.
options(PretrainedOptions) — Additional options for loading the feature_extractor.
Returns: Promise<FeatureExtractor> — A new feature extractor instance.
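The class selection described above amounts to a lookup keyed on feature_extractor_type. Here is a minimal sketch of that pattern, assuming a simple Map-based registry. TOY_REGISTRY, resolveFeatureExtractor, and the toy classes are hypothetical names used for illustration only; they are not the library's internals.

```javascript
// Toy stand-ins for concrete feature extractor classes (hypothetical).
class ToyFeatureExtractor {
  constructor(config) {
    this.config = config;
  }
}
class ToyWhisperFeatureExtractor extends ToyFeatureExtractor {}

// Hypothetical registry: feature_extractor_type string -> class.
const TOY_REGISTRY = new Map([
  ['WhisperFeatureExtractor', ToyWhisperFeatureExtractor],
]);

// Pick the class named by the config, failing loudly on unknown types.
function resolveFeatureExtractor(config) {
  const cls = TOY_REGISTRY.get(config.feature_extractor_type);
  if (!cls) {
    throw new Error(`Unknown feature_extractor_type: ${config.feature_extractor_type}`);
  }
  return new cls(config);
}

const toyExtractor = resolveFeatureExtractor({
  feature_extractor_type: 'WhisperFeatureExtractor',
  sampling_rate: 16000,
});
console.log(toyExtractor.constructor.name); // ToyWhisperFeatureExtractor
```

Keeping the mapping in data rather than in a chain of conditionals means a new extractor type only needs a new registry entry.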
ImageProcessor
Base class for image processors.
ImageProcessor(images, args)
Preprocess one or more images and batch the result into pixel_values.
Parameters
images(RawImage[]?) — The image or images to preprocess.
args(...any) — Additional arguments.
Returns: Promise<ImageProcessorResult> — An object containing the concatenated pixel values (and other metadata) of the preprocessed images.
ImageProcessor.constructor(config)
Create an image processor from a parsed preprocessor_config.json.
Parameters
config(ImageProcessorConfig) — The configuration object.
ImageProcessor.thumbnail(image, size, [resample])
Resize the image to make a thumbnail. The image is resized so that no dimension is larger than any corresponding dimension of the specified size.
Parameters
image(RawImage) — The image to be resized.
size({height:number, width:number}) — The size {"height": h, "width": w} to resize the image to.
resample(string|0|1|2|3|4|5) optional — defaults to 2 — The resampling filter to use.
Returns: Promise<RawImage> — The resized image.
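The sizing rule above can be illustrated without touching any pixels. The sketch below computes only the target dimensions. It assumes the thumbnail never upscales and preserves the aspect ratio by shrinking the shorter side; thumbnailSize is a hypothetical helper, not a library function, and the real method also performs the resampling.

```javascript
// Hypothetical helper: compute thumbnail dimensions only (no pixel work).
function thumbnailSize(input, max) {
  // Never upscale: each side is capped by both the input and the requested size.
  let height = Math.min(input.height, max.height);
  let width = Math.min(input.width, max.width);
  if (height === input.height && width === input.width) {
    return { height, width }; // already fits, nothing to do
  }
  // Shrink the shorter side so the aspect ratio is approximately preserved.
  if (input.height > input.width) {
    width = Math.floor((input.width * height) / input.height);
  } else if (input.width > input.height) {
    height = Math.floor((input.height * width) / input.width);
  }
  return { height, width };
}

console.log(thumbnailSize({ height: 1000, width: 500 }, { height: 200, width: 200 }));
// { height: 200, width: 100 }
```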
ImageProcessor.crop_margin(image, gray_threshold)
Crops the margin of the image. Gray pixels are considered margin (i.e., pixels with a value below the threshold).
Parameters
image(RawImage) — The image to be cropped.
gray_threshold(number) — Value below which pixels are considered to be gray.
Returns: Promise<RawImage> — The cropped image.
ImageProcessor.pad_image(pixelData, imgDims, padSize, options)
Pad the image by a certain amount.
Parameters
pixelData(Float32Array) — The pixel data to pad.
imgDims(number[]) — The dimensions of the image (height, width, channels).
padSize({width:number; height:number}|number|'square') — The dimensions of the padded image.
options(Object) — The options for padding.
- mode('constant'|'symmetric') optional — defaults to 'constant' — The type of padding to add.
- center(boolean) optional — defaults to false — Whether to center the image.
- constant_values(number[]?) optional — defaults to 0 — The constant value to use for padding.
Returns: [Float32Array, number[]] — The padded pixel data and image dimensions.
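To make the 'constant' mode concrete, here is a simplified stand-alone sketch. It assumes interleaved (height, width, channels) pixel data, a single scalar fill value, and a padSize given as an object; padConstant is a hypothetical helper and does not cover 'symmetric' mode, numeric padSize, or 'square'.

```javascript
// Hypothetical helper: constant padding for interleaved (H, W, C) pixel data.
function padConstant(pixelData, imgDims, padSize, { center = false, constantValue = 0 } = {}) {
  const [height, width, channels] = imgDims;
  // Start from a buffer filled with the padding value.
  const padded = new Float32Array(padSize.height * padSize.width * channels).fill(constantValue);
  // Without centering, the image stays in the top-left corner.
  const top = center ? Math.floor((padSize.height - height) / 2) : 0;
  const left = center ? Math.floor((padSize.width - width) / 2) : 0;
  for (let y = 0; y < height; ++y) {
    for (let x = 0; x < width; ++x) {
      const src = (y * width + x) * channels;
      const dst = ((y + top) * padSize.width + (x + left)) * channels;
      for (let c = 0; c < channels; ++c) {
        padded[dst + c] = pixelData[src + c];
      }
    }
  }
  // Same return shape as documented above: padded data plus new dimensions.
  return [padded, [padSize.height, padSize.width, channels]];
}

const [paddedData, paddedDims] = padConstant(
  new Float32Array([1, 2, 3, 4]), // a 2x2 single-channel image
  [2, 2, 1],
  { width: 3, height: 3 },
);
console.log(Array.from(paddedData).join(',')); // 1,2,0,3,4,0,0,0,0
console.log(paddedDims); // [ 3, 3, 1 ]
```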
ImageProcessor.rescale(pixelData)
Rescale the image pixel values by this.rescale_factor.
Parameters
pixelData(Float32Array) — The pixel data to rescale.
Returns: void
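Because the method returns void, the array is presumably modified in place. A minimal stand-alone sketch of that behaviour, assuming a factor of 1/255 (a common choice for 8-bit images, but an assumption here); rescaleInPlace is a hypothetical helper, whereas the real method reads the factor from this.rescale_factor.

```javascript
// Hypothetical stand-alone version of an in-place rescale.
function rescaleInPlace(pixelData, rescaleFactor = 1 / 255) {
  for (let i = 0; i < pixelData.length; ++i) {
    pixelData[i] *= rescaleFactor; // mutate the caller's array
  }
}

const pixels = new Float32Array([0, 127.5, 255]);
rescaleInPlace(pixels);
console.log(pixels[2]); // 1
```

Callers that still need the original values should copy the array before rescaling.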
ImageProcessor.get_resize_output_image_size(image, size)
Find the target (width, height) dimension of the output image after resizing given the input image and the desired size.
Parameters
image(RawImage) — The image to resize.
size(any) — The size to use for resizing the image.
Returns: [number, number] — The target (width, height) dimension of the output image after resizing.
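The size parameter is typed any because its form depends on the model's configuration. As one illustrative case only, the sketch below assumes size is a target length for the shortest edge and that the aspect ratio is preserved; this is an assumption, other forms of size exist, and resizeToShortestEdge is a hypothetical helper rather than a library function.

```javascript
// Hypothetical helper: scale so the shortest edge matches the target length.
function resizeToShortestEdge(image, shortestEdge) {
  const scale = shortestEdge / Math.min(image.width, image.height);
  // Return (width, height), matching the order documented above.
  return [Math.round(image.width * scale), Math.round(image.height * scale)];
}

console.log(resizeToShortestEdge({ width: 640, height: 480 }, 240)); // [ 320, 240 ]
```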
ImageProcessor.resize(image)
Resizes the image.
Parameters
image(RawImage) — The image to resize.
Returns: Promise<RawImage> — The resized image.
ImageProcessor.preprocess(image, overrides)
Preprocesses the given image.
Parameters
image(RawImage) — The image to preprocess.
overrides(Object) — The overrides for the preprocessing options.
Returns: Promise<PreprocessedImage> — The preprocessed image.
ImageProcessor.from_pretrained(pretrained_model_name_or_path, options)
Instantiate one of the processor classes of the library from a pretrained model.
The processor class to instantiate is selected based on the image_processor_type (or feature_extractor_type; legacy)
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible)
Parameters
pretrained_model_name_or_path(string) — The name or path of the pretrained model. Can be either:
- A string, the model ID of a pretrained processor hosted inside a model repo on huggingface.co. Valid model IDs can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing processor files, e.g., ./my_model_directory/.
options(PretrainedOptions) — Additional options for loading the processor.
Returns: Promise<ImageProcessor> — A new image processor instance.
AutoFeatureExtractor
Loads a feature extractor from a pretrained id. The concrete class is
selected from the feature_extractor_type in preprocessor_config.json.
Most commonly used for audio models.
import { AutoFeatureExtractor, load_audio } from '@huggingface/transformers';
const extractor = await AutoFeatureExtractor.from_pretrained('onnx-community/whisper-tiny.en');
const audio = await load_audio('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav', 16000);
const { input_features } = await extractor(audio);
AutoFeatureExtractor.from_pretrained(pretrained_model_name_or_path, options)
Instantiate one of the feature extractor classes of the library from a pretrained model.
The feature extractor class to instantiate is selected based on the feature_extractor_type property of
the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible)
Parameters
pretrained_model_name_or_path(string) — The name or path of the pretrained model. Can be either:
- A string, the model ID of a pretrained feature extractor hosted inside a model repo on huggingface.co. Valid model IDs can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing feature extractor files, e.g., ./my_model_directory/.
options(PretrainedOptions) — Additional options for loading the feature_extractor.
Returns: Promise<FeatureExtractor> — A new feature extractor instance.
AutoImageProcessor
Loads an image processor from a pretrained id. The concrete class is
selected from the image_processor_type in preprocessor_config.json.
import { AutoImageProcessor, load_image } from '@huggingface/transformers';
const processor = await AutoImageProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
const image = await load_image('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/artemis.jpeg');
const { pixel_values } = await processor(image);
AutoImageProcessor.from_pretrained(pretrained_model_name_or_path, options)
Instantiate one of the processor classes of the library from a pretrained model.
The processor class to instantiate is selected based on the image_processor_type (or feature_extractor_type; legacy)
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible)
Parameters
pretrained_model_name_or_path(string) — The name or path of the pretrained model. Can be either:
- A string, the model ID of a pretrained processor hosted inside a model repo on huggingface.co. Valid model IDs can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing processor files, e.g., ./my_model_directory/.
options(PretrainedOptions) — Additional options for loading the processor.
Returns: Promise<ImageProcessor> — A new image processor instance.
AutoProcessor
Loads a processor from a pretrained id. Unlike AutoImageProcessor and
AutoFeatureExtractor, AutoProcessor returns a multi-modal Processor
that bundles together a tokenizer, image processor, and/or feature extractor
— use it when a single model needs more than one.
Example: Load a Whisper processor (tokenizer + audio feature extractor).
import { AutoProcessor } from '@huggingface/transformers';
const processor = await AutoProcessor.from_pretrained('onnx-community/whisper-tiny.en');
Example: Run an image through a CLIP processor.
import { AutoProcessor, load_image } from '@huggingface/transformers';
const processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
const image = await load_image('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
const { pixel_values } = await processor(image);
AutoProcessor.from_pretrained(pretrained_model_name_or_path, options)
Instantiate one of the processor classes of the library from a pretrained model.
The processor class to instantiate is selected based on the image_processor_type (or feature_extractor_type; legacy)
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible)
Parameters
pretrained_model_name_or_path(string) — The name or path of the pretrained model. Can be either:
- A string, the model ID of a pretrained processor hosted inside a model repo on huggingface.co. Valid model IDs can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing processor files, e.g., ./my_model_directory/.
options(PretrainedProcessorOptions) — Additional options for loading the processor.
Returns: Promise<Processor> — A new processor instance.
Processor
Multi-modal preprocessor that delegates to the tokenizer, image processor, and/or feature extractor required by a model.
Processor(input, args)
Calls the feature_extractor function with the given input.
Parameters
input(any) — The input to extract features from.
args(...any) — Additional arguments.
Returns: Promise<any> — A Promise that resolves with the extracted features.
Processor.constructor(config, components, chat_template)
Create a processor from parsed config and its component preprocessors.
Parameters
config(Object) — Processor configuration.
components(Record<string,Object>) — Loaded tokenizer, image processor, and/or feature extractor.
chat_template(string|null) — Optional chat template loaded from the model repo.
Processor.apply_chat_template(messages, options)
Delegates to the underlying tokenizer's apply_chat_template.
Parameters
messages(Message[])
options(ApplyChatTemplateOptions<TTokenize,TReturnTensor,TReturnDict>)
Returns: ApplyChatTemplateReturn<TTokenize, TReturnTensor, TReturnDict>
Processor.batch_decode(batch, decode_args)
Decode a batch of tokenized sequences via the underlying tokenizer.
Parameters
batch(number[][]|Tensor) — List/Tensor of tokenized input sequences.
decode_args(Object) — (Optional) Object with decoding arguments.
Returns: string[]
Processor.decode(token_ids, [decode_args])
Decode a single tokenized sequence via the underlying tokenizer.
Parameters
token_ids(number[]|bigint[]|Tensor) — List/Tensor of token IDs to decode.
decode_args(Object) optional — defaults to {}
- skip_special_tokens(boolean) optional — defaults to false — If true, special tokens are removed from the output string.
- clean_up_tokenization_spaces(boolean) optional — defaults to true — If true, spaces before punctuation and abbreviated forms are removed.
Returns: string
Processor.from_pretrained(pretrained_model_name_or_path, options)
Instantiate one of the processor classes of the library from a pretrained model.
The processor class to instantiate is selected based on the image_processor_type (or feature_extractor_type; legacy)
property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible)
Parameters
pretrained_model_name_or_path(string) — The name or path of the pretrained model. Can be either:
- A string, the model ID of a pretrained processor hosted inside a model repo on huggingface.co. Valid model IDs can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing processor files, e.g., ./my_model_directory/.
options(PretrainedProcessorOptions) — Additional options for loading the processor.
Returns: Promise<Processor> — A new processor instance.
Type Definitions
HeightWidth
Named tuple indicating that sizes are ordered (height x width), even though the graphics industry standard is (width x height).
Type: [height: number, width: number]
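Since most image APIs report width first, it is easy to swap the two values by accident. A small sketch of converting named dimensions into this tuple order; toHeightWidth is a hypothetical helper used only to illustrate the ordering.

```javascript
// Hypothetical helper: build a [height, width] tuple from named fields.
function toHeightWidth({ width, height }) {
  return [height, width]; // height comes first in this library's convention
}

const hw = toHeightWidth({ width: 640, height: 480 });
console.log(hw); // [ 480, 640 ]
```

Destructuring by name rather than by position avoids depending on the order in which the caller lists the fields.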
ImageProcessorResult
Properties
pixel_values(Tensor) — The pixel values of the batched preprocessed images.
original_sizes(HeightWidth[]) — Array of two-dimensional tuples like [[480, 640]].
reshaped_input_sizes(HeightWidth[]) — Array of two-dimensional tuples like [[1000, 1330]].
ImageProcessorConfig
A configuration object used to create an image processor.
Properties
progress_callback(function) optional — defaults to null — If specified, this function is called during model construction with progress updates.
image_mean(number[]) optional — The mean values for image normalization.
image_std(number[]) optional — The standard deviation values for image normalization.
do_rescale(boolean) optional — Whether to rescale the image pixel values to the [0,1] range.
rescale_factor(number) optional — The factor to use for rescaling the image pixel values.
do_normalize(boolean) optional — Whether to normalize the image pixel values.
do_resize(boolean) optional — Whether to resize the image.
resample(number) optional — What method to use for resampling.
size(number|Object) optional — The size to resize the image to.
image_size(number|Object) optional — The size to resize the image to (same as size).
do_flip_channel_order(boolean) optional — defaults to false — Whether to flip the color channels from RGB to BGR. Can be overridden by the do_flip_channel_order parameter in the preprocess method.
do_center_crop(boolean) optional — Whether to center crop the image to the specified crop_size. Can be overridden by do_center_crop in the preprocess method.
do_thumbnail(boolean) optional — Whether to resize the image using thumbnail method.
keep_aspect_ratio(boolean) optional — If true, the image is resized to the largest possible size such that the aspect ratio is preserved. Can be overridden by keep_aspect_ratio in preprocess.
ensure_multiple_of(number) optional — If do_resize is true, the image is resized to a size that is a multiple of this value. Can be overridden by ensure_multiple_of in preprocess.
mean(number[]) optional — The mean values for image normalization (same as image_mean).
std(number[]) optional — The standard deviation values for image normalization (same as image_std).
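The rescaling and normalization options above combine in the usual way: multiply by rescale_factor, then subtract the per-channel mean and divide by the per-channel standard deviation. The sketch below illustrates that arithmetic on interleaved (height, width, channels) data. rescaleAndNormalize is a hypothetical helper, and the default values it uses (both steps enabled, a factor of 1/255) are assumptions rather than documented library defaults.

```javascript
// Hypothetical helper mirroring the do_rescale / do_normalize options.
function rescaleAndNormalize(pixelData, channels, config) {
  const {
    do_rescale = true, // assumed default
    rescale_factor = 1 / 255, // assumed default
    do_normalize = true, // assumed default
    image_mean,
    image_std,
  } = config;
  const out = Float32Array.from(pixelData); // leave the input untouched
  for (let i = 0; i < out.length; ++i) {
    const c = i % channels; // channel index in interleaved layout
    if (do_rescale) out[i] *= rescale_factor;
    if (do_normalize) out[i] = (out[i] - image_mean[c]) / image_std[c];
  }
  return out;
}

// One white RGB pixel: each channel becomes (1 - 0.5) / 0.5 = 1.
const normalized = rescaleAndNormalize(new Float32Array([255, 255, 255]), 3, {
  image_mean: [0.5, 0.5, 0.5],
  image_std: [0.5, 0.5, 0.5],
});
console.log(Array.from(normalized)); // [ 1, 1, 1 ]
```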
PreprocessedImage
Properties
original_size(HeightWidth) — The original size of the image.
reshaped_input_size(HeightWidth) — The reshaped input size of the image.
pixel_values(Tensor) — The pixel values of the preprocessed image.
ProcessorProperties
Additional processor-specific properties.
PretrainedProcessorOptions
Type: PretrainedOptions & ProcessorProperties