Introduction[[introduction]]

In Chapter 2 we explored how to use tokenizers and pretrained models to make predictions. But what if you want to fine-tune a pretrained model to solve a specific task? That's the topic of this chapter! You will learn:

How to prepare a large dataset from the Hub using the latest 🤗 Datasets features
How to use the high-level Trainer API to fine-tune a model with modern best practices
How to implement a custom training loop with optimization techniques
How to leverage the 🤗 Accelerate library to easily run distributed training on any setup
How to apply current fine-tuning best practices for maximum performance

📚 Essential Resources: Before starting, you might want to review the 🤗 Datasets documentation for data processing.

This chapter will also serve as an introduction to some Hugging Face libraries beyond the 🤗 Transformers library! We'll see how libraries like 🤗 Datasets, 🤗 Tokenizers, 🤗 Accelerate, and 🤗 Evaluate can help you train models more efficiently and effectively.

Each of the main sections in this chapter will teach you something different:

Section 2: Learn modern data preprocessing techniques and efficient dataset handling
Section 3: Master the powerful Trainer API with all its latest features
Section 4: Implement training loops from scratch and understand distributed training with Accelerate
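As a rough preview of the kind of from-scratch loop Section 4 refers to, here is a minimal sketch in plain PyTorch. It is not the chapter's actual code: the random `features`/`labels` tensors and the tiny `nn.Linear` classifier are stand-ins (assumptions for illustration) for a tokenized dataset and a pretrained BERT model, and the hyperparameters are arbitrary.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in for a preprocessed dataset: 64 examples, 8 features, 2 classes.
features = torch.randn(64, 8)
labels = (features.sum(dim=1) > 0).long()
loader = DataLoader(TensorDataset(features, labels), batch_size=16, shuffle=True)

# Toy classifier standing in for a pretrained model with a classification head.
model = nn.Linear(8, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()

epoch_losses = []
model.train()
for epoch in range(10):
    running_loss = 0.0
    for batch_features, batch_labels in loader:
        optimizer.zero_grad()                  # clear gradients from the previous step
        logits = model(batch_features)         # forward pass
        loss = loss_fn(logits, batch_labels)   # compute the batch loss
        loss.backward()                        # backward pass
        optimizer.step()                       # update the parameters
        running_loss += loss.item()
    # Record the mean loss over all batches in this epoch.
    epoch_losses.append(running_loss / len(loader))
    print(f"epoch {epoch}: mean loss {epoch_losses[-1]:.4f}")
```

The same forward, loss, backward, step cycle is what a high-level API such as Trainer runs for you; writing it by hand mainly buys control over each of those steps.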

By the end of this chapter, you'll be able to fine-tune models on your own datasets using both high-level APIs and custom training loops, applying the latest best practices in the field.

🎯 What You'll Build: By the end of this chapter, you'll have fine-tuned a BERT model for text classification and understand how to adapt the techniques to your own datasets and tasks.

This chapter focuses exclusively on PyTorch, which is now widely used in both deep learning research and production. We'll use current APIs and recommended practices from the Hugging Face ecosystem.

To upload your trained models to the Hugging Face Hub, you will need a Hugging Face account. If you don't have one yet, create it before you start.
