How to use from the
Use from the
Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="TOFU-SFT/Qwen2.5-Math-7B-4bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TOFU-SFT/Qwen2.5-Math-7B-4bit")
model = AutoModelForCausalLM.from_pretrained("TOFU-SFT/Qwen2.5-Math-7B-4bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
Quick Links

Model Description

  • Developed by: Qwen
  • Model type: Causal Language Models
  • License: Apache 2.0

Citation

@article{yang2024qwen25mathtechnicalreportmathematical,
  title={Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement}, 
  author={An Yang and Beichen Zhang and Binyuan Hui and Bofei Gao and Bowen Yu and Chengpeng Li and Dayiheng Liu and Jianhong Tu and Jingren Zhou and Junyang Lin and Keming Lu and Mingfeng Xue and Runji Lin and Tianyu Liu and Xingzhang Ren and Zhenru Zhang},
  journal={arXiv preprint arXiv:2409.12122},
  year={2024}
}
Downloads last month
59
Safetensors
Model size
8B params
Tensor type
F32
·
BF16
·
U8
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for TOFU-SFT/Qwen2.5-Math-7B-4bit

Base model

Qwen/Qwen2.5-7B
Quantized
(81)
this model

Collection including TOFU-SFT/Qwen2.5-Math-7B-4bit

Paper for TOFU-SFT/Qwen2.5-Math-7B-4bit