ExecuTorch

ExecuTorch is an end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch ecosystem and supports the deployment of PyTorch models with a focus on portability, productivity, and performance.

ExecuTorch introduces well-defined entry points to perform model-, device-, and/or use-case-specific optimizations such as backend delegation, user-defined compiler transformations, memory planning, and more. The first step in preparing a PyTorch model for execution on an edge device with ExecuTorch is to export the model, which is done with the torch.export PyTorch API.

ExecuTorch Integration

An integration point is being developed to ensure that 🤗 Transformers can be exported using torch.export. The goal of this integration is not only to enable export but also to ensure that the exported artifact can be further lowered and optimized to run efficiently in ExecuTorch, particularly for mobile and edge use cases.

class transformers.TorchExportableModuleWithStaticCache(model: PreTrainedModel, batch_size: Optional[int] = None, max_cache_len: Optional[int] = None, device: Optional[torch.device] = None)

(source: https://github.com/huggingface/transformers/blob/vr_33892/src/transformers/integrations/executorch.py#L463)

A recipe module that makes a PreTrainedModel exportable with torch.export, specifically for decoder-only language models that use StaticCache. This module ensures that the exported model is compatible with further lowering and execution in ExecuTorch.

Note: This class is specifically designed to support the export process using torch.export in a way that ensures the model can be further lowered and run efficiently in ExecuTorch.

forward(input_ids: Optional[torch.LongTensor] = None, inputs_embeds: Optional[torch.Tensor] = None, cache_position: Optional[torch.Tensor] = None)

(source: https://github.com/huggingface/transformers/blob/vr_33892/src/transformers/integrations/executorch.py#L548)

Parameters:

  • input_ids (torch.Tensor) — Tensor representing the current input token ids to the module.
  • inputs_embeds (torch.Tensor) — Tensor representing the current input embeddings to the module.
  • cache_position (torch.Tensor) — Tensor representing the current input position in the cache.

Returns: torch.Tensor — Logits output from the model.

Forward pass of the module, which is compatible with the ExecuTorch runtime.

This forward adapter serves two primary purposes:

  1. Making the Model torch.export-Compatible: The adapter hides unsupported objects, such as the Cache, from the graph inputs and outputs, enabling the model to be exportable using torch.export without encountering issues.

  2. Ensuring Compatibility with ExecuTorch runtime: The adapter matches the model's forward signature with that in executorch/extension/llm/runner, ensuring that the exported model can be executed in ExecuTorch out-of-the-box.

transformers.convert_and_export_with_cache(model: PreTrainedModel, example_input_ids: Optional[torch.Tensor] = None, example_cache_position: Optional[torch.Tensor] = None, dynamic_shapes: Optional[dict] = None, strict: Optional[bool] = None)

(source: https://github.com/huggingface/transformers/blob/vr_33892/src/transformers/integrations/executorch.py#L747)

Parameters:

  • model (PreTrainedModel) — The pretrained model to be exported.
  • example_input_ids (Optional[torch.Tensor]) — Example input token ids used by torch.export.
  • example_cache_position (Optional[torch.Tensor]) — Example current cache position used by torch.export.
  • dynamic_shapes (Optional[dict]) — Dynamic shapes used by torch.export.
  • strict (Optional[bool]) — Flag instructing torch.export to use torchdynamo.

Returns: torch.export.ExportedProgram — The exported program generated via torch.export.

Convert a PreTrainedModel into an exportable module and export it using torch.export, ensuring the exported model is compatible with ExecuTorch.
