processors

Processors are used to prepare inputs (e.g., text, image or audio) for a model.

Example: Using a WhisperProcessor to prepare an audio input for a model.

import { AutoProcessor, read_audio } from '@huggingface/transformers';

const processor = await AutoProcessor.from_pretrained('openai/whisper-tiny.en');
const audio = await read_audio('https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac', 16000);
const { input_features } = await processor(audio);
// Tensor {
//   data: Float32Array(240000) [0.4752984642982483, 0.5597258806228638, 0.56434166431427, ...],
//   dims: [1, 80, 3000],
//   type: 'float32',
//   size: 240000,
// }

processors
- static
  - .Processor
    - new Processor(config, components, chat_template)
    - instance
      - .image_processor ⇒ ImageProcessor | undefined
      - .tokenizer ⇒ PreTrainedTokenizer | undefined
      - .feature_extractor ⇒ FeatureExtractor | undefined
      - .apply_chat_template(messages, options) ⇒ ReturnType.<PreTrainedTokenizer>
      - .batch_decode(...args) ⇒ ReturnType.<PreTrainedTokenizer>
      - .decode(...args) ⇒ ReturnType.<PreTrainedTokenizer>
      - ._call(input, ...args) ⇒ Promise.<any>
    - static
      - .from_pretrained(pretrained_model_name_or_path, options) ⇒ Promise.<Processor>
- inner
  - ~PreTrainedTokenizer : Object

processors.Processor

Represents a Processor that extracts features from an input.

Kind: static class of processors

.Processor
- new Processor(config, components, chat_template)
- instance
  - .image_processor ⇒ ImageProcessor | undefined
  - .tokenizer ⇒ PreTrainedTokenizer | undefined
  - .feature_extractor ⇒ FeatureExtractor | undefined
  - .apply_chat_template(messages, options) ⇒ ReturnType.<PreTrainedTokenizer>
  - .batch_decode(...args) ⇒ ReturnType.<PreTrainedTokenizer>
  - .decode(...args) ⇒ ReturnType.<PreTrainedTokenizer>
  - ._call(input, ...args) ⇒ Promise.<any>
- static
  - .from_pretrained(pretrained_model_name_or_path, options) ⇒ Promise.<Processor>

`new Processor(config, components, chat_template)`

Creates a new Processor with the given components.

| Param | Type |
| --- | --- |
| config | `Object` |
| components | `Record.<string, Object>` |
| chat_template | `string` |

`processor.image_processor` ⇒ ImageProcessor | undefined

Kind: instance property of Processor
Returns: ImageProcessor | undefined - The image processor of the processor, if it exists.

`processor.tokenizer` ⇒ PreTrainedTokenizer | undefined

Kind: instance property of Processor
Returns: PreTrainedTokenizer | undefined - The tokenizer of the processor, if it exists.

`processor.feature_extractor` ⇒ FeatureExtractor | undefined

Kind: instance property of Processor
Returns: FeatureExtractor | undefined - The feature extractor of the processor, if it exists.
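
The three accessors above can be read as lookups into the `components` record passed to the constructor. The following is a minimal sketch of that idea, assuming the components are stored under the keys `image_processor`, `tokenizer`, and `feature_extractor`. `SketchProcessor` and the stub extractor are hypothetical stand-ins for illustration, not the library's classes.

```javascript
// Hypothetical stand-in for illustration only; not the library's Processor class.
class SketchProcessor {
  constructor(config, components, chat_template) {
    this.config = config;
    this.components = components; // assumed shape: { image_processor?, tokenizer?, feature_extractor? }
    this.chat_template = chat_template;
  }

  // Each accessor returns the matching component, or undefined if it is absent.
  get image_processor() { return this.components.image_processor; }
  get tokenizer() { return this.components.tokenizer; }
  get feature_extractor() { return this.components.feature_extractor; }
}

// An audio-style processor that only carries a feature extractor.
const audioOnly = new SketchProcessor({}, { feature_extractor: { name: 'stub-extractor' } });

// List which components are actually present before using them.
const availableComponents = ['image_processor', 'tokenizer', 'feature_extractor']
  .filter((key) => audioOnly[key] !== undefined);

console.log(availableComponents); // [ 'feature_extractor' ]
```

Because each accessor may return `undefined`, checking for a component before calling it avoids a runtime error on processors that lack it.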

`processor.apply_chat_template(messages, options)` ⇒ ReturnType.<PreTrainedTokenizer>

Kind: instance method of Processor

| Param | Type |
| --- | --- |
| messages | `Parameters` |
| options | `Parameters` |
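
Since the return type is that of the tokenizer, one way to picture this method is as a thin wrapper that forwards `messages` and `options` to the tokenizer's own `apply_chat_template`. The sketch below assumes exactly that, plus the common `{ role, content }` message shape and an `add_generation_prompt` option. The classes, the toy template, and the error message are hypothetical stand-ins, not the library's implementation.

```javascript
// Hypothetical stand-ins for illustration only; not the library's classes.
const stubTokenizer = {
  // Toy template: one "<role>: <content>" line per message.
  apply_chat_template(messages, options = {}) {
    const lines = messages.map((m) => `${m.role}: ${m.content}`);
    if (options.add_generation_prompt) lines.push('assistant:');
    return lines.join('\n');
  },
};

class SketchProcessor {
  constructor(components) { this.components = components; }
  get tokenizer() { return this.components.tokenizer; }

  // Forward to the tokenizer; fail loudly when the processor has none.
  apply_chat_template(messages, options = {}) {
    if (!this.tokenizer) {
      throw new Error('Unable to apply chat template without a tokenizer.');
    }
    return this.tokenizer.apply_chat_template(messages, options);
  }
}

const messages = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Describe this image.' },
];

const prompt = new SketchProcessor({ tokenizer: stubTokenizer })
  .apply_chat_template(messages, { add_generation_prompt: true });

console.log(prompt);
```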

`processor.batch_decode(...args)` ⇒ ReturnType.<PreTrainedTokenizer>

Kind: instance method of Processor

| Param | Type |
| --- | --- |
| ...args | `Parameters.<PreTrainedTokenizer>` |

`processor.decode(...args)` ⇒ ReturnType.<PreTrainedTokenizer>

Kind: instance method of Processor

| Param | Type |
| --- | --- |
| ...args | `Parameters.<PreTrainedTokenizer>` |
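
Both `decode` and `batch_decode` take the tokenizer's parameters and return the tokenizer's result, which suggests plain delegation. A minimal sketch under that assumption follows. The three-word vocabulary, the stub tokenizer, and the error messages are invented for illustration and do not reflect a real model.

```javascript
// Hypothetical stand-ins for illustration only; not the library's classes.
const stubTokenizer = {
  vocab: ['hello', 'world', '!'],
  // Map each token id to its string and join with spaces.
  decode(ids) { return ids.map((id) => this.vocab[id]).join(' '); },
  // Decode every sequence in the batch independently.
  batch_decode(batch) { return batch.map((ids) => this.decode(ids)); },
};

class SketchProcessor {
  constructor(components) { this.components = components; }
  get tokenizer() { return this.components.tokenizer; }

  decode(...args) {
    if (!this.tokenizer) throw new Error('Unable to decode without a tokenizer.');
    return this.tokenizer.decode(...args);
  }

  batch_decode(...args) {
    if (!this.tokenizer) throw new Error('Unable to decode without a tokenizer.');
    return this.tokenizer.batch_decode(...args);
  }
}

const processor = new SketchProcessor({ tokenizer: stubTokenizer });
const single = processor.decode([0, 1]);
const batch = processor.batch_decode([[0], [1, 2]]);

console.log(single); // hello world
console.log(batch);  // [ 'hello', 'world !' ]
```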

`processor._call(input, ...args)` ⇒ Promise.<any>

Calls the feature_extractor function with the given input.

Kind: instance method of Processor
Returns: Promise.<any> - A Promise that resolves with the extracted features.

| Param | Type | Description |
| --- | --- | --- |
| input | `any` | The input to extract features from. |
| ...args | `any` | Additional arguments. |
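
To make the call flow concrete, here is a minimal sketch of a processor that forwards its input to a component and returns a Promise. It assumes a fallback order of image processor, then feature extractor, then tokenizer. That order, the class, the stub extractor, and the error message are assumptions for illustration rather than documented behavior.

```javascript
// Hypothetical stand-in for illustration only; not the library's Processor class.
class SketchProcessor {
  constructor(components) { this.components = components; }

  // Forward the input to the first component that exists (assumed order).
  async _call(input, ...args) {
    const { image_processor, feature_extractor, tokenizer } = this.components;
    for (const component of [image_processor, feature_extractor, tokenizer]) {
      if (component) return component(input, ...args);
    }
    throw new Error('No image processor, feature extractor, or tokenizer found.');
  }
}

// Stub extractor: reports how many samples it received instead of computing features.
const stubExtractor = async (audio) => ({ num_samples: audio.length });

// One second of silence at an assumed 16 kHz sampling rate.
const features = new SketchProcessor({ feature_extractor: stubExtractor })
  ._call(new Float32Array(16000));

features.then((out) => console.log(out)); // { num_samples: 16000 }
```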

`Processor.from_pretrained(pretrained_model_name_or_path, options)` ⇒ Promise.<Processor>

Instantiate one of the processor classes of the library from a pretrained model.

The processor class to instantiate is selected based on the image_processor_type (or feature_extractor_type; legacy) property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible).

Kind: static method of Processor
Returns: Promise.<Processor> - A new instance of the Processor class.

| Param | Type | Description |
| --- | --- | --- |
| pretrained_model_name_or_path | `string` | The name or path of the pretrained model. Can be either: (1) a string, the model id of a pretrained processor hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like `bert-base-uncased`, or namespaced under a user or organization name, like `dbmdz/bert-base-german-cased`; or (2) a path to a directory containing processor files, e.g., `./my_model_directory/`. |
| options | `PretrainedProcessorOptions` | Additional options for loading the processor. |
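
The two forms of `pretrained_model_name_or_path` can be exercised with a small wrapper, sketched below. It assumes `revision` and `local_files_only` are accepted option names, which this page does not list, so check `PretrainedProcessorOptions` in your installed version. `buildProcessorOptions` and `loadProcessor` are hypothetical helpers, and `AutoProcessor` is used as in the example at the top of this page.

```javascript
// Hypothetical helper: builds the options object passed to from_pretrained.
// `revision` and `local_files_only` are assumed option names.
function buildProcessorOptions({ revision = 'main', offline = false } = {}) {
  return { revision, local_files_only: offline };
}

// Hypothetical helper: loads a processor from a hub model id or a local directory.
// The package is imported lazily so that defining this function has no side effects.
async function loadProcessor(modelIdOrPath, overrides) {
  const { AutoProcessor } = await import('@huggingface/transformers');
  return AutoProcessor.from_pretrained(modelIdOrPath, buildProcessorOptions(overrides));
}

const hubOptions = buildProcessorOptions();
const offlineOptions = buildProcessorOptions({ offline: true });
console.log(hubOptions, offlineOptions);

// Usage (requires the package and, for hub model ids, network access):
// const processor = await loadProcessor('openai/whisper-tiny.en');
// const local = await loadProcessor('./my_model_directory/', { offline: true });
```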

`processors~PreTrainedTokenizer` : Object

Additional processor-specific properties.

Kind: inner typedef of processors
