Introduction[[introduction]]
In Chapter 2 we explored how to use tokenizers and pretrained models to make predictions. But what if you want to fine-tune a pretrained model to solve a specific task? That's the topic of this chapter! You will learn:
- How to prepare a large dataset from the Hub using the latest 🤗 Datasets features
- How to use the high-level Trainer API to fine-tune a model with modern best practices
- How to implement a custom training loop with optimization techniques
- How to leverage the 🤗 Accelerate library to easily run distributed training on any setup
- How to apply current fine-tuning best practices for maximum performance
📚 Essential Resources: Before starting, you might want to review the 🤗 Datasets documentation for data processing.
This chapter will also serve as an introduction to some Hugging Face libraries beyond the 🤗 Transformers library! We'll see how libraries like 🤗 Datasets, 🤗 Tokenizers, 🤗 Accelerate, and 🤗 Evaluate can help you train models more efficiently and effectively.
Each of the main sections in this chapter will teach you something different:
- Section 2: Learn modern data preprocessing techniques and efficient dataset handling
- Section 3: Master the powerful Trainer API with all its latest features
- Section 4: Implement training loops from scratch and understand distributed training with Accelerate
By the end of this chapter, you'll be able to fine-tune models on your own datasets using both high-level APIs and custom training loops, applying the latest best practices in the field.
🎯 What You'll Build: By the end of this chapter, you'll have fine-tuned a BERT model for text classification and understand how to adapt the techniques to your own datasets and tasks.
This chapter focuses exclusively on PyTorch, as it has become the standard framework for modern deep learning research and production. We'll use the latest APIs and best practices from the Hugging Face ecosystem.
To upload your trained models to the Hugging Face Hub, you will need a Hugging Face account: create an account