TRI-ML/DCLM-1B

Is this model supported for fine-tuning with flash attention?

#4 opened 12 months ago by

MMLU Performance After Token Training

#3 opened almost 2 years ago by