Buckets:

rtrm's picture
|
download
raw
5.67 kB

AdEMAMix

AdEMAMix is a variant of the Adam optimizer.

bitsandbytes also supports paged optimizers which take advantage of CUDAs unified memory to transfer memory from the GPU to the CPU when GPU memory is exhausted.

AdEMAMix[[api-class]][[bitsandbytes.optim.AdEMAMix]]

bitsandbytes.optim.AdEMAMix[[bitsandbytes.optim.AdEMAMix]]

Source

__init__bitsandbytes.optim.AdEMAMix.__init__https://github.com/bitsandbytes-foundation/bitsandbytes/blob/v0.49.2/bitsandbytes/optim/ademamix.py#L108[{"name": "params", "val": ": Iterable"}, {"name": "lr", "val": ": float = 0.001"}, {"name": "betas", "val": ": tuple = (0.9, 0.999, 0.9999)"}, {"name": "alpha", "val": ": float = 5.0"}, {"name": "t_alpha", "val": ": typing.Optional[int] = None"}, {"name": "t_beta3", "val": ": typing.Optional[int] = None"}, {"name": "eps", "val": ": float = 1e-08"}, {"name": "weight_decay", "val": ": float = 0.01"}, {"name": "optim_bits", "val": ": typing.Literal[8, 32] = 32"}, {"name": "min_8bit_size", "val": ": int = 4096"}, {"name": "is_paged", "val": ": bool = False"}]

AdEMAMix8bit[[bitsandbytes.optim.AdEMAMix8bit]]

bitsandbytes.optim.AdEMAMix8bit[[bitsandbytes.optim.AdEMAMix8bit]]

Source

__init__bitsandbytes.optim.AdEMAMix8bit.__init__https://github.com/bitsandbytes-foundation/bitsandbytes/blob/v0.49.2/bitsandbytes/optim/ademamix.py#L275[{"name": "params", "val": ": Iterable"}, {"name": "lr", "val": ": float = 0.001"}, {"name": "betas", "val": ": tuple = (0.9, 0.999, 0.9999)"}, {"name": "alpha", "val": ": float = 5.0"}, {"name": "t_alpha", "val": ": typing.Optional[int] = None"}, {"name": "t_beta3", "val": ": typing.Optional[int] = None"}, {"name": "eps", "val": ": float = 1e-08"}, {"name": "weight_decay", "val": ": float = 0.01"}, {"name": "min_8bit_size", "val": ": int = 4096"}, {"name": "is_paged", "val": ": bool = False"}]

AdEMAMix32bit[[bitsandbytes.optim.AdEMAMix32bit]]

bitsandbytes.optim.AdEMAMix32bit[[bitsandbytes.optim.AdEMAMix32bit]]

Source

__init__bitsandbytes.optim.AdEMAMix32bit.__init__https://github.com/bitsandbytes-foundation/bitsandbytes/blob/v0.49.2/bitsandbytes/optim/ademamix.py#L360[{"name": "params", "val": ": Iterable"}, {"name": "lr", "val": ": float = 0.001"}, {"name": "betas", "val": ": tuple = (0.9, 0.999, 0.9999)"}, {"name": "alpha", "val": ": float = 5.0"}, {"name": "t_alpha", "val": ": typing.Optional[int] = None"}, {"name": "t_beta3", "val": ": typing.Optional[int] = None"}, {"name": "eps", "val": ": float = 1e-08"}, {"name": "weight_decay", "val": ": float = 0.01"}, {"name": "min_8bit_size", "val": ": int = 4096"}, {"name": "is_paged", "val": ": bool = False"}]

PagedAdEMAMix[[bitsandbytes.optim.PagedAdEMAMix]]

bitsandbytes.optim.PagedAdEMAMix[[bitsandbytes.optim.PagedAdEMAMix]]

Source

__init__bitsandbytes.optim.PagedAdEMAMix.__init__https://github.com/bitsandbytes-foundation/bitsandbytes/blob/v0.49.2/bitsandbytes/optim/ademamix.py#L331[{"name": "params", "val": ": Iterable"}, {"name": "lr", "val": ": float = 0.001"}, {"name": "betas", "val": ": tuple = (0.9, 0.999, 0.9999)"}, {"name": "alpha", "val": ": float = 5.0"}, {"name": "t_alpha", "val": ": typing.Optional[int] = None"}, {"name": "t_beta3", "val": ": typing.Optional[int] = None"}, {"name": "eps", "val": ": float = 1e-08"}, {"name": "weight_decay", "val": ": float = 0.01"}, {"name": "optim_bits", "val": ": typing.Literal[8, 32] = 32"}, {"name": "min_8bit_size", "val": ": int = 4096"}]

PagedAdEMAMix8bit[[bitsandbytes.optim.PagedAdEMAMix8bit]]

bitsandbytes.optim.PagedAdEMAMix8bit[[bitsandbytes.optim.PagedAdEMAMix8bit]]

Source

__init__bitsandbytes.optim.PagedAdEMAMix8bit.__init__https://github.com/bitsandbytes-foundation/bitsandbytes/blob/v0.49.2/bitsandbytes/optim/ademamix.py#L304[{"name": "params", "val": ": Iterable"}, {"name": "lr", "val": ": float = 0.001"}, {"name": "betas", "val": ": tuple = (0.9, 0.999, 0.9999)"}, {"name": "alpha", "val": ": float = 5.0"}, {"name": "t_alpha", "val": ": typing.Optional[int] = None"}, {"name": "t_beta3", "val": ": typing.Optional[int] = None"}, {"name": "eps", "val": ": float = 1e-08"}, {"name": "weight_decay", "val": ": float = 0.01"}, {"name": "min_8bit_size", "val": ": int = 4096"}]

PagedAdEMAMix32bit[[bitsandbytes.optim.PagedAdEMAMix32bit]]

bitsandbytes.optim.PagedAdEMAMix32bit[[bitsandbytes.optim.PagedAdEMAMix32bit]]

Source

__init__bitsandbytes.optim.PagedAdEMAMix32bit.__init__https://github.com/bitsandbytes-foundation/bitsandbytes/blob/v0.49.2/bitsandbytes/optim/ademamix.py#L393[{"name": "params", "val": ": Iterable"}, {"name": "lr", "val": ": float = 0.001"}, {"name": "betas", "val": ": tuple = (0.9, 0.999, 0.9999)"}, {"name": "alpha", "val": ": float = 5.0"}, {"name": "t_alpha", "val": ": typing.Optional[int] = None"}, {"name": "t_beta3", "val": ": typing.Optional[int] = None"}, {"name": "eps", "val": ": float = 1e-08"}, {"name": "weight_decay", "val": ": float = 0.01"}, {"name": "min_8bit_size", "val": ": int = 4096"}]

Xet Storage Details

Size:
5.67 kB
·
Xet hash:
6d031197abcc49cd99f762833971e52e0b29955d5f5ffe01fff38b229f3a0c54

Xet efficiently stores files, intelligently splitting them into unique chunks and accelerating uploads and downloads. More info.