haoranli-ml/lcft_gemma-2b_prolong-gemma-parts_ProLong64KMix_bsz256_steps1250_lr1e-5_warmup0.1_rope200000hard • Text Generation • 3B • Updated about 19 hours ago • 53
haoranli-ml/sft_Gemma-2B-RoPE-Base_ultrachat_bsz256_steps63_lr2e-5_warmup0.05 • Text Generation • 3B • Updated 1 day ago • 29
haoranli-ml/sft_Gemma-2B-CoPE-Base_ultrachat_bsz256_steps63_lr2e-5_warmup0.05 • Text Generation • 3B • Updated 1 day ago • 31
haoranli-ml/lcft_gemma-2b_prolong-gemma-parts_ProLong64KMix_bsz256_steps1250_lr1e-5_warmup0.1_rope200000cope • Text Generation • 3B • Updated 1 day ago • 378
haoranli-ml/lcft_gemma-2b_prolong-gemma-parts_ProLong64KMix_bsz256_steps1250_lr1e-5_warmup0.1_rope • Text Generation • 3B • Updated 1 day ago • 189
haoranli-ml/Gemma-2B-RoPE-Instruct_wrongPretrain_and_sft_data • Text Generation • 3B • Updated 3 days ago • 1
haoranli-ml/NOBOS-lcft_gemma-2b_prolong-gemma-parts_ProLong64KMix_bsz256_steps1250_lr1e-5_warmup0.1_cope • Text Generation • 3B • Updated 3 days ago • 177
haoranli-ml/VF_future_prompts_hl_gauss_corrected_cdf_bin_012sigma_future_condition_1_18 • Updated Jan 18