AdaGrad

AdaGrad (Adaptive Gradient) is an adaptive learning rate optimizer. It keeps a running sum of the squared past gradients for each parameter and uses it to scale that parameter's learning rate. Parameters with a large accumulated gradient take smaller steps while parameters with a small or sparse gradient history keep taking larger ones, which reduces the need to manually tune the learning rate.
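
Written out, the update divides each step by the square root of the accumulated squared gradients. The sketch below is an illustrative NumPy version of a single AdaGrad step, not the bitsandbytes kernel; `param`, `grad`, and `state_sum` are hypothetical names for a parameter array, its gradient, and the running accumulator.

```python
import numpy as np

def adagrad_step(param, grad, state_sum, lr=1e-2, eps=1e-10):
    # Accumulate the squared gradient for every parameter element.
    state_sum += grad ** 2
    # Elements with a large gradient history get a smaller effective step.
    param -= lr * grad / (np.sqrt(state_sum) + eps)
    return param, state_sum
```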

Adagrad

class bitsandbytes.optim.Adagrad(params, lr=0.01, lr_decay=0, weight_decay=0, initial_accumulator_value=0, eps=1e-10, optim_bits=32, args=None, min_8bit_size=4096)

Source: https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/bitsandbytes/optim/adagrad.py#L9

Base Adagrad optimizer.

Parameters:

params (torch.tensor) : The input parameters to optimize.

lr (float, defaults to 1e-2) : The learning rate.

lr_decay (int, defaults to 0) : The learning rate decay.

weight_decay (float, defaults to 0.0) : The weight decay value for the optimizer.

initial_accumulator_value (int, defaults to 0) : The initial value of the squared-gradient accumulator.

eps (float, defaults to 1e-10) : The epsilon value that prevents division by zero in the optimizer.

optim_bits (int, defaults to 32) : The number of bits of the optimizer state.

args (object, defaults to None) : An object with additional arguments.

min_8bit_size (int, defaults to 4096) : The minimum number of elements of the parameter tensors for 8-bit optimization.
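
For orientation, here is a minimal training-step sketch built on the signature above. The toy linear model, random data, and CUDA device are assumptions for illustration only.

```python
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(4096, 4096).cuda()
optimizer = bnb.optim.Adagrad(model.parameters(), lr=1e-2, weight_decay=0.0)

x = torch.randn(8, 4096, device="cuda")
y = torch.randn(8, 4096, device="cuda")

loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()       # AdaGrad update, 32-bit optimizer state by default
optimizer.zero_grad()
```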

Adagrad8bit

class bitsandbytes.optim.Adagrad8bit(params, lr=0.01, lr_decay=0, weight_decay=0, initial_accumulator_value=0, eps=1e-10, optim_bits=8, args=None, min_8bit_size=4096)

Source: https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/bitsandbytes/optim/adagrad.py#L68

8-bit Adagrad optimizer.

Parameters:

params (torch.tensor) : The input parameters to optimize.

lr (float, defaults to 1e-2) : The learning rate.

lr_decay (int, defaults to 0) : The learning rate decay.

weight_decay (float, defaults to 0.0) : The weight decay value for the optimizer.

initial_accumulator_value (int, defaults to 0) : The initial value of the squared-gradient accumulator.

eps (float, defaults to 1e-10) : The epsilon value that prevents division by zero in the optimizer.

optim_bits (int, defaults to 8) : The number of bits of the optimizer state.

args (object, defaults to None) : An object with additional arguments.

min_8bit_size (int, defaults to 4096) : The minimum number of elements of the parameter tensors for 8-bit optimization.
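
Switching to the 8-bit variant only changes the optimizer construction; this sketch assumes the same toy model as above. Per min_8bit_size, parameter tensors with fewer elements than that threshold are not optimized in 8-bit.

```python
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(4096, 4096).cuda()

# Same call pattern as the base optimizer; only the state precision differs.
optimizer = bnb.optim.Adagrad8bit(model.parameters(), lr=1e-2, min_8bit_size=4096)
```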

Adagrad32bit

class bitsandbytes.optim.Adagrad32bit(params, lr=0.01, lr_decay=0, weight_decay=0, initial_accumulator_value=0, eps=1e-10, optim_bits=32, args=None, min_8bit_size=4096)

Source: https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/bitsandbytes/optim/adagrad.py#L127

32-bit Adagrad optimizer.

Parameters:

params (torch.tensor) : The input parameters to optimize.

lr (float, defaults to 1e-2) : The learning rate.

lr_decay (int, defaults to 0) : The learning rate decay.

weight_decay (float, defaults to 0.0) : The weight decay value for the optimizer.

initial_accumulator_value (int, defaults to 0) : The initial value of the squared-gradient accumulator.

eps (float, defaults to 1e-10) : The epsilon value that prevents division by zero in the optimizer.

optim_bits (int, defaults to 32) : The number of bits of the optimizer state.

args (object, defaults to None) : An object with additional arguments.

min_8bit_size (int, defaults to 4096) : The minimum number of elements of the parameter tensors for 8-bit optimization.
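
Adagrad32bit makes the 32-bit optimizer state explicit; with the defaults listed above its signature matches the base Adagrad class. A one-line construction sketch, assuming the same toy model as before:

```python
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(4096, 4096).cuda()
optimizer = bnb.optim.Adagrad32bit(model.parameters(), lr=1e-2)
```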
