After 2 months of refinement, I'm happy to announce that a lot of Transformers' modeling code is now significantly more torch.compile- and torch.export-friendly!
Why it had to be done: PyTorch's Dynamo compiler is increasingly becoming the default interoperability layer for ML systems. Anything that relies on torch.export or torch.compile, from model optimization to cross-framework integrations, benefits directly when models can be captured as a single Dynamo-traced graph!
Transformers models are now easier to:
- Compile end-to-end with torch.compile backends
- Export reliably via torch.export and torch.onnx.export
- Deploy to ONNX / ONNX Runtime, Intel's OpenVINO, NVIDIA AutoDeploy (TRT-LLM), AMD's Quark, Meta's ExecuTorch, and more hardware-specific runtimes
This work aims to unblock entire TorchDynamo-based toolchains that rely on exporting Transformers models across runtimes and accelerators.
We are doubling down on Transformers' commitment to being a first-class citizen of the PyTorch ecosystem: more exportable, more optimizable, and easier to deploy everywhere.
There are definitely some edge cases we haven't addressed yet, so don't hesitate to try compiling / exporting your favorite Transformers models and to open issues / PRs.
PR in the comments! More updates coming soon!
While Hugging Face offers extensive tutorials on classification and NLP tasks, there is very little guidance on performing regression tasks with Transformers. In my latest article, I provide a step-by-step guide to running regression with Hugging Face, applying it to financial news data to predict stock returns. In this tutorial, you will learn how to:
- Prepare and preprocess textual and numerical data for regression
- Configure a Transformer model for regression tasks
- Apply the model to real-world financial datasets with fully reproducible code
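The configuration step above boils down to two settings on the model config: `num_labels=1` for a single continuous output, and `problem_type="regression"` so the model computes an MSE loss when labels are passed. A minimal sketch with a tiny random DistilBERT (hypothetical sizes; in practice you would load a pretrained checkpoint with `AutoModelForSequenceClassification.from_pretrained(..., num_labels=1, problem_type="regression")`):

```python
import torch
from transformers import DistilBertConfig, DistilBertForSequenceClassification

# num_labels=1  -> one continuous output per example (e.g. a stock return)
# problem_type="regression" -> Transformers uses MSELoss instead of cross-entropy
config = DistilBertConfig(vocab_size=1000, dim=64, n_layers=2, n_heads=2,
                          hidden_dim=128, num_labels=1,
                          problem_type="regression")
model = DistilBertForSequenceClassification(config)

# Continuous float targets, one per example in the batch.
input_ids = torch.randint(0, config.vocab_size, (2, 8))
labels = torch.tensor([0.1, -0.2])

out = model(input_ids=input_ids, labels=labels)
# out.logits has shape (2, 1); out.loss is the MSE against the targets.
```

From here, training proceeds exactly as in a classification fine-tune (e.g. with `Trainer`), just with float labels.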