How to use mniknam/llama32-entity_only with Transformers:
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("mniknam/llama32-entity_only", dtype="auto")
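The loading call above returns only the bare model. As a rough sketch of what a first sanity check might look like, the helper below tokenizes a sentence, runs one forward pass, and reports the shape of the final hidden states. Several things here are assumptions rather than facts from the model card: that the repository also ships tokenizer files loadable with `AutoTokenizer`, that PyTorch is installed, and that the loaded model's output exposes `last_hidden_state` (as base Llama models loaded through `AutoModel` do). The helper name `last_hidden_state_shape` and the sample sentence are hypothetical. On older Transformers releases the `dtype` argument may need to be spelled `torch_dtype`.

```python
MODEL_ID = "mniknam/llama32-entity_only"


def last_hidden_state_shape(text, model_id=MODEL_ID):
    """Run one forward pass and return the shape of the final hidden states.

    Hypothetical helper for illustration; not part of the model card.
    """
    # Imported lazily so that defining the helper does not require the libraries.
    import torch
    from transformers import AutoModel, AutoTokenizer

    # Assumption: the repository includes tokenizer files.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id, dtype="auto")
    model.eval()  # disable dropout for inference

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():  # no gradients needed for a forward-only check
        outputs = model(**inputs)

    # Assumption: the output object has a `last_hidden_state` tensor
    # of shape (batch, sequence_length, hidden_size).
    return tuple(outputs.last_hidden_state.shape)


# Example call (downloads the weights on first use):
# print(last_hidden_state_shape("Ada Lovelace was born in London."))
```

If the call succeeds, the first element of the returned tuple should be 1 (a single input sentence); the other two depend on the tokenizer and the model configuration.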