Buckets:

rtrm's picture
|
download
raw
5.67 kB

AdEMAMix

AdEMAMix is a variant of the Adam optimizer.

bitsandbytes also supports paged optimizers which take advantage of CUDAs unified memory to transfer memory from the GPU to the CPU when GPU memory is exhausted.

AdEMAMix[[api-class]][[bitsandbytes.optim.AdEMAMix]]

bitsandbytes.optim.AdEMAMix[[bitsandbytes.optim.AdEMAMix]]

Source

__init__bitsandbytes.optim.AdEMAMix.__init__https://github.com/bitsandbytes-foundation/bitsandbytes/blob/vr_1908/bitsandbytes/optim/ademamix.py#L108[{"name": "params", "val": ": Iterable"}, {"name": "lr", "val": ": float = 0.001"}, {"name": "betas", "val": ": tuple = (0.9, 0.999, 0.9999)"}, {"name": "alpha", "val": ": float = 5.0"}, {"name": "t_alpha", "val": ": typing.Optional[int] = None"}, {"name": "t_beta3", "val": ": typing.Optional[int] = None"}, {"name": "eps", "val": ": float = 1e-08"}, {"name": "weight_decay", "val": ": float = 0.01"}, {"name": "optim_bits", "val": ": typing.Literal[8, 32] = 32"}, {"name": "min_8bit_size", "val": ": int = 4096"}, {"name": "is_paged", "val": ": bool = False"}]

AdEMAMix8bit[[bitsandbytes.optim.AdEMAMix8bit]]

bitsandbytes.optim.AdEMAMix8bit[[bitsandbytes.optim.AdEMAMix8bit]]

Source

__init__bitsandbytes.optim.AdEMAMix8bit.__init__https://github.com/bitsandbytes-foundation/bitsandbytes/blob/vr_1908/bitsandbytes/optim/ademamix.py#L271[{"name": "params", "val": ": Iterable"}, {"name": "lr", "val": ": float = 0.001"}, {"name": "betas", "val": ": tuple = (0.9, 0.999, 0.9999)"}, {"name": "alpha", "val": ": float = 5.0"}, {"name": "t_alpha", "val": ": typing.Optional[int] = None"}, {"name": "t_beta3", "val": ": typing.Optional[int] = None"}, {"name": "eps", "val": ": float = 1e-08"}, {"name": "weight_decay", "val": ": float = 0.01"}, {"name": "min_8bit_size", "val": ": int = 4096"}, {"name": "is_paged", "val": ": bool = False"}]

AdEMAMix32bit[[bitsandbytes.optim.AdEMAMix32bit]]

bitsandbytes.optim.AdEMAMix32bit[[bitsandbytes.optim.AdEMAMix32bit]]

Source

__init__bitsandbytes.optim.AdEMAMix32bit.__init__https://github.com/bitsandbytes-foundation/bitsandbytes/blob/vr_1908/bitsandbytes/optim/ademamix.py#L356[{"name": "params", "val": ": Iterable"}, {"name": "lr", "val": ": float = 0.001"}, {"name": "betas", "val": ": tuple = (0.9, 0.999, 0.9999)"}, {"name": "alpha", "val": ": float = 5.0"}, {"name": "t_alpha", "val": ": typing.Optional[int] = None"}, {"name": "t_beta3", "val": ": typing.Optional[int] = None"}, {"name": "eps", "val": ": float = 1e-08"}, {"name": "weight_decay", "val": ": float = 0.01"}, {"name": "min_8bit_size", "val": ": int = 4096"}, {"name": "is_paged", "val": ": bool = False"}]

PagedAdEMAMix[[bitsandbytes.optim.PagedAdEMAMix]]

bitsandbytes.optim.PagedAdEMAMix[[bitsandbytes.optim.PagedAdEMAMix]]

Source

__init__bitsandbytes.optim.PagedAdEMAMix.__init__https://github.com/bitsandbytes-foundation/bitsandbytes/blob/vr_1908/bitsandbytes/optim/ademamix.py#L327[{"name": "params", "val": ": Iterable"}, {"name": "lr", "val": ": float = 0.001"}, {"name": "betas", "val": ": tuple = (0.9, 0.999, 0.9999)"}, {"name": "alpha", "val": ": float = 5.0"}, {"name": "t_alpha", "val": ": typing.Optional[int] = None"}, {"name": "t_beta3", "val": ": typing.Optional[int] = None"}, {"name": "eps", "val": ": float = 1e-08"}, {"name": "weight_decay", "val": ": float = 0.01"}, {"name": "optim_bits", "val": ": typing.Literal[8, 32] = 32"}, {"name": "min_8bit_size", "val": ": int = 4096"}]

PagedAdEMAMix8bit[[bitsandbytes.optim.PagedAdEMAMix8bit]]

bitsandbytes.optim.PagedAdEMAMix8bit[[bitsandbytes.optim.PagedAdEMAMix8bit]]

Source

__init__bitsandbytes.optim.PagedAdEMAMix8bit.__init__https://github.com/bitsandbytes-foundation/bitsandbytes/blob/vr_1908/bitsandbytes/optim/ademamix.py#L300[{"name": "params", "val": ": Iterable"}, {"name": "lr", "val": ": float = 0.001"}, {"name": "betas", "val": ": tuple = (0.9, 0.999, 0.9999)"}, {"name": "alpha", "val": ": float = 5.0"}, {"name": "t_alpha", "val": ": typing.Optional[int] = None"}, {"name": "t_beta3", "val": ": typing.Optional[int] = None"}, {"name": "eps", "val": ": float = 1e-08"}, {"name": "weight_decay", "val": ": float = 0.01"}, {"name": "min_8bit_size", "val": ": int = 4096"}]

PagedAdEMAMix32bit[[bitsandbytes.optim.PagedAdEMAMix32bit]]

bitsandbytes.optim.PagedAdEMAMix32bit[[bitsandbytes.optim.PagedAdEMAMix32bit]]

Source

__init__bitsandbytes.optim.PagedAdEMAMix32bit.__init__https://github.com/bitsandbytes-foundation/bitsandbytes/blob/vr_1908/bitsandbytes/optim/ademamix.py#L387[{"name": "params", "val": ": Iterable"}, {"name": "lr", "val": ": float = 0.001"}, {"name": "betas", "val": ": tuple = (0.9, 0.999, 0.9999)"}, {"name": "alpha", "val": ": float = 5.0"}, {"name": "t_alpha", "val": ": typing.Optional[int] = None"}, {"name": "t_beta3", "val": ": typing.Optional[int] = None"}, {"name": "eps", "val": ": float = 1e-08"}, {"name": "weight_decay", "val": ": float = 0.01"}, {"name": "min_8bit_size", "val": ": int = 4096"}]

Xet Storage Details

Size:
5.67 kB
·
Xet hash:
7d0acf8264f58ca0ea6ddee2a1a06388ecef1163f9be6f4f84dd0fcd801185db

Xet efficiently stores files, intelligently splitting them into unique chunks and accelerating uploads and downloads. More info.