# DeepSpeed
[DeepSpeed](https://github.com/deepspeedai/DeepSpeed), powered by Zero Redundancy Optimizer (ZeRO), is an optimization library for training very large models and fitting them onto a GPU. ZeRO works in several stages, where each stage progressively saves more GPU memory by partitioning the optimizer state, gradients, and parameters, and by enabling offloading to a CPU or NVMe. DeepSpeed is integrated with the [Trainer](/docs/transformers/pr_33892/en/main_classes/trainer#transformers.Trainer) class, and most of the setup is taken care of for you automatically.
However, if you want to use DeepSpeed without the [Trainer](/docs/transformers/pr_33892/en/main_classes/trainer#transformers.Trainer), Transformers provides a `HfDeepSpeedConfig` class.
<Tip>
Learn more about using DeepSpeed with [Trainer](/docs/transformers/pr_33892/en/main_classes/trainer#transformers.Trainer) in the [DeepSpeed](../deepspeed) guide.
</Tip>
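
A DeepSpeed configuration is a nested dictionary (or a JSON file holding one). As a rough, standalone illustration, the sketch below builds a small ZeRO stage 3 config with CPU offload and reads values from it by dotted key. `get_value` here is a hypothetical helper written for this example, not the Transformers API, and the batch size is an arbitrary value.

```python
from typing import Any

# Illustrative DeepSpeed-style config: ZeRO stage 3 with CPU offload.
# The batch size is an arbitrary example value.
ds_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu"},
        "offload_param": {"device": "cpu"},
    },
    "train_micro_batch_size_per_gpu": 8,
}


def get_value(config: dict, dotted_key: str, default: Any = None) -> Any:
    """Hypothetical helper: walk a nested dict along a dotted key path."""
    node: Any = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


zero_stage = get_value(ds_config, "zero_optimization.stage")
offload_device = get_value(ds_config, "zero_optimization.offload_param.device")
missing = get_value(ds_config, "zero_optimization.not_a_key", default="n/a")
print(zero_stage, offload_device, missing)
```

Returning a default for a missing key keeps lookups safe when a section such as offloading is left out of the config.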
## HfDeepSpeedConfig[[transformers.integrations.HfDeepSpeedConfig]]
<div class="docstring border-l-2 border-t-2 pl-4 pt-3.5 border-gray-100 rounded-tl-xl mb-6 mt-8">
<docstring><name>class transformers.integrations.HfDeepSpeedConfig</name><anchor>transformers.integrations.HfDeepSpeedConfig</anchor><source>https://github.com/huggingface/transformers/blob/vr_33892/src/transformers/integrations/deepspeed.py#L57</source><parameters>[{"name": "config_file_or_dict", "val": ""}]</parameters><paramsdesc>- **config_file_or_dict** (`Union[str, Dict]`) -- path to DeepSpeed config file or dict.</paramsdesc><paramgroups>0</paramgroups></docstring>
This object contains a DeepSpeed configuration dictionary and can be quickly queried for things like the ZeRO stage.
A `weakref` of this object is stored in the module's globals so that the config can be accessed from places where
objects such as the Trainer are not available (e.g. `from_pretrained` and `_get_resized_embeddings`). Therefore,
it's important that this object remains alive while the program is still running.
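
The weak-reference arrangement described above can be illustrated with the standard library alone. This is a sketch of the general pattern under assumed names (`DemoConfig`, `set_global_config` and `get_global_config` are hypothetical), not the Transformers source. It shows why the caller has to hold on to the object: once the last strong reference is gone, the module-level lookup returns `None`.

```python
import gc
import weakref

# Module-level weak reference; it does not keep the config object alive.
_config_ref = None


class DemoConfig:
    """Hypothetical stand-in for a config object."""

    def __init__(self, config: dict):
        self.config = config


def set_global_config(obj: DemoConfig) -> None:
    """Store only a weak reference to the object in module globals."""
    global _config_ref
    _config_ref = weakref.ref(obj)


def get_global_config():
    """Return the live object, or None if it has been garbage collected."""
    if _config_ref is None:
        return None
    return _config_ref()


cfg = DemoConfig({"zero_optimization": {"stage": 3}})
set_global_config(cfg)
alive_before = get_global_config() is cfg  # the caller still holds `cfg`

del cfg        # drop the only strong reference
gc.collect()   # make collection explicit on non-refcounting interpreters
alive_after = get_global_config() is not None
print(alive_before, alive_after)
```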
[Trainer](/docs/transformers/pr_33892/en/main_classes/trainer#transformers.Trainer) uses the `HfTrainerDeepSpeedConfig` subclass instead. That subclass has logic to sync the configuration
with values of [TrainingArguments](/docs/transformers/pr_33892/en/main_classes/trainer#transformers.TrainingArguments) by replacing special placeholder values: `"auto"`. Without this special logic
the DeepSpeed configuration is not modified in any way.
</div>
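
The `"auto"` placeholder replacement can be pictured as follows. This is only a sketch of the idea, not the `HfTrainerDeepSpeedConfig` implementation: `DemoArgs`, `AUTO_SOURCES` and `fill_auto` are hypothetical names, and the three key-to-argument pairings are chosen for illustration.

```python
import copy
from dataclasses import dataclass


@dataclass
class DemoArgs:
    """Hypothetical stand-in for training arguments."""

    per_device_train_batch_size: int = 4
    gradient_accumulation_steps: int = 2
    learning_rate: float = 3e-5


# Illustrative mapping from dotted config keys to argument fields.
AUTO_SOURCES = {
    "train_micro_batch_size_per_gpu": "per_device_train_batch_size",
    "gradient_accumulation_steps": "gradient_accumulation_steps",
    "optimizer.params.lr": "learning_rate",
}


def fill_auto(config: dict, args: DemoArgs) -> dict:
    """Return a copy of `config` with "auto" values replaced from `args`."""
    result = copy.deepcopy(config)
    for dotted_key, field in AUTO_SOURCES.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        # Only the "auto" placeholder is replaced; explicit values are kept.
        if isinstance(node, dict) and node.get(leaf) == "auto":
            node[leaf] = getattr(args, field)
    return result


ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": 8,  # explicit value, left untouched
    "optimizer": {"type": "AdamW", "params": {"lr": "auto"}},
}
resolved = fill_auto(ds_config, DemoArgs())
print(resolved)
```

Working on a deep copy leaves the original dictionary unmodified, so the untouched config can still be inspected or reused.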
