Litzy0619/MIS0711_scaling_103_warmup_10_lr_3e3_nodistill_custom_inverse_sqrt_decay_085_plaueau_10_clip_10
Litzy0619/MIS0711_scaling_103_warmup_10_lr_3e3_nodistill_custom_inverse_sqrt_decay_085_plaueau_10
Litzy0619/MIS0711_scaling_103_warmup_10_lr_3e3_nodistill_custom_inverse_sqrt_decay_085_plaueau_10_clip_100
Litzy0619/MIS0711_scaling_103_warmup_10_lr_3e3_nodistill_custom_inverse_sqrt_decay_085
Litzy0619/MIS0711_scaling_103_warmup_10_lr_3e3_nodistill_custom_inverse_sqrt_decay_095
Litzy0619/MIS0711_scaling_103_warmup_10_lr_3e3_nodistill_custom_inverse_sqrt_decay_093
Litzy0619/MIS0711_scaling_103_warmup_10_lr_3e3_nodistill_custom_inverse_sqrt_decay_090
Litzy0619/MIS0711_scaling_103_warmup_20_lr_3e3_nodistill_custom_inverse_sqrt_decay_085
Litzy0619/MIS0711_scaling_103_warmup_20_lr_3e3_nodistill_custom_inverse_sqrt_decay_090
Litzy0619/MIS0711_scaling_103_warmup_20_lr_3e3_nodistill_custom_inverse_sqrt_decay_093
Litzy0619/MIS0711_scaling_103_warmup_20_lr_3e3_nodistill_custom_inverse_sqrt_decay_095
Litzy0619/MIS0711_scaling_103_warmup_10_lr_3e3_nodistill_inverse_sqrt
Litzy0619/MIS0711_scaling_103_warmup_20_lr_3e3_nodistill_inverse_sqrt
Litzy0619/MIS0711_scaling_103_warmup_10_lr_1e3_nodistill_inverse_sqrt
Litzy0619/MIS0711_scaling_103_warmup_10_lr_2e3_nodistill_inverse_sqrt
Litzy0619/MIS0711_scaling_103_warmup_30_lr_1e4_nodistill_cosine
Litzy0619/MIS0711_scaling_103_warmup_40_lr_3e3_nodistill_inverse_sqrt
Litzy0619/MIS0711_scaling_103_warmup_30_lr_2e4_nodistill_cosine
Litzy0619/MIS0711_scaling_103_warmup_30_lr_3e4_nodistill_cosine
Litzy0619/MIS0711_scaling_103_warmup_30_lr_4e4_nodistill_cosine
Litzy0619/MIS0711_scaling_103_warmup_30_lr_5e4_nodistill_cosine
Litzy0619/MIS0711_scaling_103_warmup_30_lr_6e4_nodistill_cosine
Litzy0619/MIS0711_scaling_103_warmup_30_lr_9e4_nodistill_cosine
Litzy0619/MIS0711_scaling_103_warmup_30_lr_1e3_nodistill_cosine
Litzy0619/MIS0711_scaling_103_warmup_30_lr_2e3_nodistill_cosine
Litzy0619/MIS0711_scaling_103_warmup_30_lr_7e4_nodistill_cosine
Litzy0619/MIS0711_scaling_103_warmup_30_lr_8e4_nodistill_cosine
Litzy0619/MIS0711_scaling_103_warmup_40_lr_3e3_nodistill_polynomial
Litzy0619/MIS0711_scaling_103_warmup_40_lr_3e3_nodistill_constant_with_warmup_bs_192
Litzy0619/MIS0711_scaling_103_warmup_40_lr_3e3_nodistill_constant_bs_192