How to use from the Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="RedHatAI/mpt-7b-chat-pruned50-quant-ds", trust_remote_code=True)
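
The snippet above builds the pipeline but never calls it. As a minimal sketch of how the result could be read, the hypothetical helper below (`generate_reply` is not part of the model card or of Transformers) assumes the usual text-generation pipeline output: a list of dicts whose `generated_text` field holds the prompt followed by the continuation.

```python
from typing import Any, Callable, Dict, List


def generate_reply(
    pipe: Callable[..., List[Dict[str, Any]]],
    prompt: str,
    max_new_tokens: int = 50,
) -> str:
    """Run a text-generation pipeline on `prompt` and return the generated text.

    `generate_reply` is an illustrative helper, not a Transformers API.
    """
    # The pipeline returns one dict per generated sequence; take the first.
    outputs = pipe(prompt, max_new_tokens=max_new_tokens)
    text = outputs[0]["generated_text"]
    # By default the prompt is echoed at the start of the output; drop it if so.
    if text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()
```

With the `pipe` object created above, `print(generate_reply(pipe, "Tell me a joke."))` would print only the model's continuation.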
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RedHatAI/mpt-7b-chat-pruned50-quant-ds", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("RedHatAI/mpt-7b-chat-pruned50-quant-ds", trust_remote_code=True)

Sparse MPT-7B-Chat - DeepSparse

A chat-aligned MPT-7B model, pruned to 50% sparsity and quantized using SparseGPT, for inference with DeepSparse.

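from deepsparse import TextGeneration

# Load the sparse, quantized model into a DeepSparse text-generation pipeline
model = TextGeneration(model="hf:neuralmagic/mpt-7b-chat-pruned50-quant")
# Generate up to 50 new tokens and print the text of the first generation
print(model("Tell me a joke.", max_new_tokens=50).generations[0].text)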