# Kwargs handlers
The following objects can be passed to the main [Accelerator](/docs/accelerate/pr_4021/en/package_reference/accelerator#accelerate.Accelerator) to customize how some PyTorch objects
related to distributed training or mixed precision are created.
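Several handlers can be combined in a single `kwargs_handlers` list. A minimal sketch using two of the handlers documented below:

```python
from accelerate import Accelerator
from accelerate.utils import AutocastKwargs, DistributedDataParallelKwargs

# Each handler customizes one PyTorch object; pass them together in one list
handlers = [
    AutocastKwargs(cache_enabled=True),
    DistributedDataParallelKwargs(find_unused_parameters=True),
]
accelerator = Accelerator(kwargs_handlers=handlers)
```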
## AutocastKwargs[[accelerate.AutocastKwargs]]
#### accelerate.AutocastKwargs[[accelerate.AutocastKwargs]]
[Source](https://github.com/huggingface/accelerate/blob/vr_4021/src/accelerate/utils/dataclasses.py#L113)
Use this object in your [Accelerator](/docs/accelerate/pr_4021/en/package_reference/accelerator#accelerate.Accelerator) to customize how `torch.autocast` behaves. Please refer to the
documentation of this [context manager](https://pytorch.org/docs/stable/amp.html#torch.autocast) for more
information on each argument.
Example:
```python
from accelerate import Accelerator
from accelerate.utils import AutocastKwargs
kwargs = AutocastKwargs(cache_enabled=True)
accelerator = Accelerator(kwargs_handlers=[kwargs])
```
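An `AutocastKwargs` instance can also be passed to `Accelerator.autocast` to override the autocast behavior for a single block. A minimal sketch, assuming the `autocast_handler` argument available in recent Accelerate versions:

```python
from accelerate import Accelerator
from accelerate.utils import AutocastKwargs

accelerator = Accelerator(mixed_precision="fp16")
no_cache = AutocastKwargs(cache_enabled=False)

# Temporarily disable the autocast cache for this block only
with accelerator.autocast(autocast_handler=no_cache):
    ...
```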
## DistributedDataParallelKwargs[[accelerate.DistributedDataParallelKwargs]]
#### accelerate.DistributedDataParallelKwargs[[accelerate.DistributedDataParallelKwargs]]
[Source](https://github.com/huggingface/accelerate/blob/vr_4021/src/accelerate/utils/dataclasses.py#L155)
Use this object in your [Accelerator](/docs/accelerate/pr_4021/en/package_reference/accelerator#accelerate.Accelerator) to customize how your model is wrapped in a
`torch.nn.parallel.DistributedDataParallel`. Please refer to the documentation of this
[wrapper](https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html) for more
information on each argument.
`gradient_as_bucket_view` is only available in PyTorch 1.7.0 and later versions.
`static_graph` is only available in PyTorch 1.11.0 and later versions.
Example:
```python
from accelerate import Accelerator
from accelerate.utils import DistributedDataParallelKwargs
kwargs = DistributedDataParallelKwargs(find_unused_parameters=True)
accelerator = Accelerator(kwargs_handlers=[kwargs])
```
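The version-gated options mentioned above can be combined in the same handler. A sketch, assuming a recent enough PyTorch:

```python
from accelerate import Accelerator
from accelerate.utils import DistributedDataParallelKwargs

# static_graph requires PyTorch >= 1.11, gradient_as_bucket_view PyTorch >= 1.7
kwargs = DistributedDataParallelKwargs(gradient_as_bucket_view=True, static_graph=True)
accelerator = Accelerator(kwargs_handlers=[kwargs])
```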
## FP8RecipeKwargs[[accelerate.utils.FP8RecipeKwargs]]
#### accelerate.utils.FP8RecipeKwargs[[accelerate.utils.FP8RecipeKwargs]]
[Source](https://github.com/huggingface/accelerate/blob/vr_4021/src/accelerate/utils/dataclasses.py#L455)
Deprecated. Please use one of the proper FP8 recipe kwargs classes such as `TERecipeKwargs` or `MSAMPRecipeKwargs`
instead.
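A replacement recipe handler is passed the same way. A minimal sketch using `TERecipeKwargs`; the field name `fp8_format` follows the TransformerEngine recipe and may differ across versions:

```python
from accelerate import Accelerator
from accelerate.utils import TERecipeKwargs

# Illustrative recipe value; check your TransformerEngine version for supported options
kwargs = TERecipeKwargs(fp8_format="HYBRID")
accelerator = Accelerator(mixed_precision="fp8", kwargs_handlers=[kwargs])
```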
## ProfileKwargs[[accelerate.ProfileKwargs]]
#### accelerate.ProfileKwargs[[accelerate.ProfileKwargs]]
[Source](https://github.com/huggingface/accelerate/blob/vr_4021/src/accelerate/utils/dataclasses.py#L484)
Use this object in your [Accelerator](/docs/accelerate/pr_4021/en/package_reference/accelerator#accelerate.Accelerator) to customize the initialization of the profiler. Please refer to the
documentation of this [context manager](https://pytorch.org/docs/stable/profiler.html#torch.profiler.profile) for
more information on each argument.
`torch.profiler` is only available in PyTorch 1.8.1 and later versions.
Example:
```python
from accelerate import Accelerator
from accelerate.utils import ProfileKwargs
kwargs = ProfileKwargs(activities=["cpu", "cuda"])
accelerator = Accelerator(kwargs_handlers=[kwargs])
```
**Parameters:**
activities (`List[str]`, *optional*, defaults to `None`) : The list of activity groups to use in profiling. Must be one of `"cpu"`, `"xpu"`, `"mtia"`, `"hpu"` or `"cuda"`.
schedule_option (`Dict[str, int]`, *optional*, defaults to `None`) : The schedule option to use for the profiler. Available keys are `wait`, `warmup`, `active`, `repeat` and `skip_first`. The profiler will skip the first `skip_first` steps, wait for `wait` steps, warm up for the next `warmup` steps, record actively for the next `active` steps, and then repeat the cycle starting with `wait` steps. The optional number of cycles is specified with the `repeat` parameter; a value of zero means the cycles will continue until the profiling is finished.
on_trace_ready (`Callable`, *optional*, defaults to `None`) : Callable that is called at each step when the schedule returns `ProfilerAction.RECORD_AND_SAVE` during profiling.
record_shapes (`bool`, *optional*, defaults to `False`) : Save information about operators' input shapes.
profile_memory (`bool`, *optional*, defaults to `False`) : Track tensor memory allocation/deallocation.
with_stack (`bool`, *optional*, defaults to `False`) : Record source information (file and line number) for the ops.
with_flops (`bool`, *optional*, defaults to `False`) : Use a formula to estimate the FLOPs of specific operators.
with_modules (`bool`, *optional*, defaults to `False`) : Record the module hierarchy (including function names) corresponding to the call stack of the op.
output_trace_dir (`str`, *optional*, defaults to `None`) : Exports the collected trace in Chrome JSON format, which can be viewed in Chrome at `chrome://tracing`. Defaults to `None`, meaning no JSON trace files are stored.
#### accelerate.ProfileKwargs.build[[accelerate.ProfileKwargs.build]]
[Source](https://github.com/huggingface/accelerate/blob/vr_4021/src/accelerate/utils/dataclasses.py#L574)
Build a profiler object with the current configuration.
**Returns:**
`torch.profiler.profile`
The profiler object.
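Putting these options together, a sketch of a scheduled profiling run, assuming the `Accelerator.profile` helper (which yields the built profiler); the `dataloader` and training step are placeholders:

```python
from accelerate import Accelerator
from accelerate.utils import ProfileKwargs

kwargs = ProfileKwargs(
    activities=["cpu", "cuda"],
    schedule_option={"wait": 1, "warmup": 1, "active": 3, "repeat": 2, "skip_first": 1},
    profile_memory=True,
    output_trace_dir="./traces",  # Chrome JSON traces are written here
)
accelerator = Accelerator(kwargs_handlers=[kwargs])

with accelerator.profile() as prof:
    for batch in dataloader:  # placeholder training loop
        ...
        prof.step()  # advance the wait/warmup/active schedule
```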
## GradScalerKwargs[[accelerate.GradScalerKwargs]]
#### accelerate.GradScalerKwargs[[accelerate.GradScalerKwargs]]
[Source](https://github.com/huggingface/accelerate/blob/vr_4021/src/accelerate/utils/dataclasses.py#L241)
Use this object in your [Accelerator](/docs/accelerate/pr_4021/en/package_reference/accelerator#accelerate.Accelerator) to customize the behavior of mixed precision, specifically how the
`torch.amp.GradScaler` or `torch.cuda.amp.GradScaler` used is created. Please refer to the documentation of this
[scaler](https://pytorch.org/docs/stable/amp.html?highlight=gradscaler) for more information on each argument.
`torch.cuda.amp.GradScaler` is only available in PyTorch 1.5.0 and later versions, and `torch.amp.GradScaler` is
only available in PyTorch 2.4.0 and later versions.
Example:
```python
from accelerate import Accelerator
from accelerate.utils import GradScalerKwargs
kwargs = GradScalerKwargs(backoff_factor=0.25)
accelerator = Accelerator(kwargs_handlers=[kwargs])
```
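The handler accepts the standard `GradScaler` constructor arguments, so several can be tuned at once. A sketch with illustrative values:

```python
from accelerate import Accelerator
from accelerate.utils import GradScalerKwargs

# Lower initial scale, slower backoff, less frequent growth than the defaults
kwargs = GradScalerKwargs(init_scale=2.0**14, backoff_factor=0.5, growth_interval=1000)
accelerator = Accelerator(mixed_precision="fp16", kwargs_handlers=[kwargs])
```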
## InitProcessGroupKwargs[[accelerate.InitProcessGroupKwargs]]
#### accelerate.InitProcessGroupKwargs[[accelerate.InitProcessGroupKwargs]]
[Source](https://github.com/huggingface/accelerate/blob/vr_4021/src/accelerate/utils/dataclasses.py#L273)
Use this object in your [Accelerator](/docs/accelerate/pr_4021/en/package_reference/accelerator#accelerate.Accelerator) to customize the initialization of the distributed processes. Please refer
to the documentation of this
[method](https://pytorch.org/docs/stable/distributed.html#torch.distributed.init_process_group) for more
information on each argument.
Note: If `timeout` is set to `None`, the default will be based upon how `backend` is set.
Example:
```python
from datetime import timedelta
from accelerate import Accelerator
from accelerate.utils import InitProcessGroupKwargs
kwargs = InitProcessGroupKwargs(timeout=timedelta(seconds=800))
accelerator = Accelerator(kwargs_handlers=[kwargs])
```
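Since the default `timeout` depends on the `backend` (as noted above), pinning both explicitly makes the behavior predictable. A sketch:

```python
from datetime import timedelta
from accelerate import Accelerator
from accelerate.utils import InitProcessGroupKwargs

# Pin the backend and timeout together so neither depends on the other's default
kwargs = InitProcessGroupKwargs(backend="nccl", timeout=timedelta(minutes=30))
accelerator = Accelerator(kwargs_handlers=[kwargs])
```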
## KwargsHandler[[accelerate.utils.KwargsHandler]]
#### accelerate.utils.KwargsHandler[[accelerate.utils.KwargsHandler]]
[Source](https://github.com/huggingface/accelerate/blob/vr_4021/src/accelerate/utils/dataclasses.py#L68)
Internal mixin that implements a `to_kwargs()` method for a dataclass.
#### accelerate.utils.KwargsHandler.to_kwargs[[accelerate.utils.KwargsHandler.to_kwargs]]
[Source](https://github.com/huggingface/accelerate/blob/vr_4021/src/accelerate/utils/dataclasses.py#L76)
Returns a dictionary containing the attributes with values different from the default of this class.
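For example, with one of the handlers above, only the non-default field appears in the output:

```python
from accelerate.utils import DistributedDataParallelKwargs

kwargs = DistributedDataParallelKwargs(find_unused_parameters=True)
print(kwargs.to_kwargs())  # {'find_unused_parameters': True}
```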
