How to use TRI-ML/DCLM-1B with Transformers:
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("TRI-ML/DCLM-1B", dtype="auto")
Does this model support fine-tuning with Flash Attention 2?
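One way to find out empirically, rather than a confirmed answer: ask Transformers for the backend through the `attn_implementation` argument of `from_pretrained` and see whether the load succeeds. Transformers raises `ValueError` when an architecture does not support the requested backend and `ImportError` when the `flash_attn` package is missing. The sketch below tries the backends in order and reports which one loaded. The helper names `load_with_attention_fallback` and `try_dclm` are hypothetical, and whether `TRI-ML/DCLM-1B` accepts `flash_attention_2` is not established here.

```python
def load_with_attention_fallback(
    loader,
    model_id,
    implementations=("flash_attention_2", "sdpa", "eager"),
):
    """Try each attention backend in order.

    `loader` is any callable with the shape of `from_pretrained`.
    Returns (model, backend) for the first backend that loads.
    """
    errors = {}
    for impl in implementations:
        try:
            model = loader(model_id, attn_implementation=impl, dtype="auto")
            return model, impl
        except (ImportError, ValueError) as exc:
            # ValueError: backend not supported by this architecture.
            # ImportError: backend package (e.g. flash_attn) not installed.
            errors[impl] = exc
    raise RuntimeError(f"No attention backend loaded for {model_id}: {errors}")


def try_dclm():
    """Hypothetical usage; needs transformers installed and network access."""
    from transformers import AutoModel

    model, backend = load_with_attention_fallback(
        AutoModel.from_pretrained, "TRI-ML/DCLM-1B"
    )
    print(f"Loaded with attention backend: {backend}")
    return model
```

If `try_dclm()` reports `flash_attention_2`, the backend at least loads; that alone does not guarantee that a fine-tuning run will work end to end. Flash Attention 2 also expects half-precision weights (fp16 or bf16), so check what `dtype="auto"` resolves to for this checkpoint.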