# DeepSpeed

[DeepSpeed](https://github.com/deepspeedai/DeepSpeed), powered by the Zero Redundancy Optimizer (ZeRO), is an optimization library for training very large models and fitting them onto a GPU. ZeRO works in several stages, each saving progressively more GPU memory by partitioning the optimizer states, gradients, and parameters, and by enabling offloading to a CPU or NVMe. DeepSpeed is integrated with the [Trainer](/docs/transformers/pr_33892/en/main_classes/trainer#transformers.Trainer) class, which takes care of most of the setup automatically.

To use DeepSpeed without the [Trainer](/docs/transformers/pr_33892/en/main_classes/trainer#transformers.Trainer), Transformers provides the `HfDeepSpeedConfig` class.

<Tip>

Learn more about using DeepSpeed with [Trainer](/docs/transformers/pr_33892/en/main_classes/trainer#transformers.Trainer) in the [DeepSpeed](../deepspeed) guide.

</Tip>

## HfDeepSpeedConfig[[transformers.integrations.HfDeepSpeedConfig]]

<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">

<docstring><name>class transformers.integrations.HfDeepSpeedConfig</name><anchor>transformers.integrations.HfDeepSpeedConfig</anchor><source>https://github.com/huggingface/transformers/blob/vr_33892/src/transformers/integrations/deepspeed.py#L57</source><parameters>[{"name": "config_file_or_dict", "val": ""}]</parameters><paramsdesc>- **config_file_or_dict** (`Union[str, Dict]`) -- path to DeepSpeed config file or dict.</paramsdesc><paramgroups>0</paramgroups></docstring>

This object contains a DeepSpeed configuration dictionary and can be quickly queried for settings such as the ZeRO stage.

A `weakref` of this object is stored in the module's globals so that the config can be accessed from places where
objects such as the Trainer are not available (e.g. `from_pretrained` and `_get_resized_embeddings`). It is therefore
important that this object stays alive for as long as the program is running.

[Trainer](/docs/transformers/pr_33892/en/main_classes/trainer#transformers.Trainer) uses the `HfTrainerDeepSpeedConfig` subclass instead. That subclass has logic to sync the configuration
with the values of [TrainingArguments](/docs/transformers/pr_33892/en/main_classes/trainer#transformers.TrainingArguments) by replacing the special placeholder value `"auto"`. Without this special logic,
the DeepSpeed configuration is not modified in any way.

</div>
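
The points above (no `"auto"` substitution outside `Trainer`, and the need to keep the config object alive) can be illustrated with a minimal sketch. The config values, the `find_auto_placeholders` helper, and the `build_hf_ds_config` wrapper are hypothetical and exist only for this example; the sketch assumes that `transformers`, `accelerate`, and `deepspeed` may or may not be installed, and falls back to `None` when they are not.

```python
# Illustrative DeepSpeed config; the values are assumptions, not recommendations.
ds_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu"},
    },
    # "auto" is only resolved by Trainer's HfTrainerDeepSpeedConfig subclass.
    "train_micro_batch_size_per_gpu": "auto",
}


def find_auto_placeholders(config, prefix=""):
    """Hypothetical helper: list dotted keys whose value is still "auto"."""
    found = []
    for key, value in config.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            found.extend(find_auto_placeholders(value, prefix=f"{path}."))
        elif value == "auto":
            found.append(path)
    return found


def build_hf_ds_config(config):
    """Hypothetical wrapper: return an HfDeepSpeedConfig, or None if unavailable."""
    try:
        from transformers.integrations import HfDeepSpeedConfig

        return HfDeepSpeedConfig(config)
    except (ImportError, RuntimeError):
        # Missing or mismatched dependencies; some versions may surface
        # lazy-import failures as RuntimeError.
        return None


# Outside Trainer nothing fills in "auto", so set concrete values first.
unresolved = find_auto_placeholders(ds_config)
ds_config["train_micro_batch_size_per_gpu"] = 1

# Keep this reference alive: only a weakref is stored in the module globals,
# so create the object before the model and hold it while the model is in use.
dschf = build_hf_ds_config(ds_config)
if dschf is not None:
    print("ZeRO stage 3:", dschf.is_zero3())
```

In a real script, the model would be created with `from_pretrained` only after `dschf` exists, and `dschf` would stay in scope for the rest of the run.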