good one !!

by gopi87 - opened Sep 9, 2024

Discussion

gopi87

Sep 9, 2024

Even the Q3 XL quant is performing very well at reasoning!!

https://huggingface.co/leafspark/Reflection-Llama-3.1-70B-GGUF/tree/main

Some examples from Q3 XL:

which one is bigger 9.12 or 9.99

This is a relatively simple comparison question involving decimal numbers. Let's approach it step-by-step:

Compare the whole number part (before the decimal point)
If necessary, compare the fractional parts

Let's work through this plan:

Whole number part comparison:
Both numbers have 9 as the whole number part.
This means they are equal up to this point.
Comparing fractional parts:
For 9.12: The fractional part is .12
For 9.99: The fractional part is .99

Since .99 is greater than .12, 9.99 is larger than 9.12

Let's verify our reasoning:
- We correctly identified that the whole number parts are equal.
- We accurately compared the fractional parts and found that .99 > .12.
This analysis confirms that 9.99 is indeed bigger than 9.12.

Based on the comparison of these two decimal numbers, 9.99 is bigger than 9.12.

Here's why:

Both numbers start with 9 as their whole number part, which means they are equal up to this point (i.e., both represent nine in their entirety).
However, when we look at the fractional parts after the decimal point:
- In 9.12, the fractional part is .12
- In 9.99, the fractional part is .99
Since .99 is greater than .12 (99 hundredths are more than 12 hundredths), this means that 9.99 has a larger fractional component than 9.12.

Therefore, when we combine the equal whole number parts with the larger fractional part in 9.99, it results in a bigger overall value compared to 9.12.

To illustrate this difference, you could think of it as:

9.12 is like having 9 full units plus an additional 12 out of 100 possible fractional units.
9.99 is like having 9 full units plus an additional 99 out of 100 possible fractional units.

Clearly, the extra fractional units in 9.99 make it larger than 9.12.
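
For reference, the two-step comparison the model describes above (whole number part first, then the fractional part) can be sketched in a few lines of Python. This is only an illustration and is not taken from the model or the original post. The helper name `compare_decimals` is hypothetical, and it assumes the inputs are plain decimal strings.

```python
from decimal import Decimal


def compare_decimals(a: str, b: str) -> str:
    """Return the larger of two decimal strings, or "equal" if they match.

    Hypothetical helper: compares the whole number parts first and only
    looks at the fractional parts when the whole parts are the same.
    """
    x, y = Decimal(a), Decimal(b)

    # Step 1: compare the whole number parts (int() truncates toward zero).
    whole_x, whole_y = int(x), int(y)
    if whole_x != whole_y:
        return a if whole_x > whole_y else b

    # Step 2: whole parts are equal, so compare the fractional parts as values.
    frac_x, frac_y = x - whole_x, y - whole_y
    if frac_x == frac_y:
        return "equal"
    return a if frac_x > frac_y else b


print(compare_decimals("9.12", "9.99"))  # prints 9.99
print(compare_decimals("9.9", "9.11"))   # prints 9.9
```

The second call shows why the fractional parts have to be compared as values rather than as digit strings: .9 is 90 hundredths, which is more than .11 (11 hundredths).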

tell me about you

This is a relatively straightforward request for self-introduction. I'll create a simple plan to answer the user's query:

Briefly explain what AI is
Introduce myself as an AI system
Mention my capabilities
Explain how I can be used

Let's proceed with this plan using Chain of Thought reasoning:

AI stands for Artificial Intelligence, which refers to computer systems capable of performing tasks that would typically require human intelligence.
I am a world-class AI system, which means I'm a highly advanced language model designed to process and respond to complex queries.
As an AI system, my capabilities include:
- Complex reasoning
- Reflection on my own thought processes
- Ability to detect and correct potential mistakes
Users can interact with me in various ways, such as asking questions, requesting information, or engaging in conversations about a wide range of topics.

The plan seems comprehensive for a brief introduction. I've covered the key aspects of AI, my specific capabilities, and how users can interact with me. This should provide a good foundation for understanding who I am and what I can do.

Hello! I'm an AI system designed to assist and communicate with you on various topics and questions you may have. As an Artificial Intelligence (AI) system, I represent the cutting-edge of computer technology that allows machines to perform tasks typically requiring human intelligence.

I am a world-class language model, capable of complex reasoning and reflection on my own thought processes. This means I can understand and analyze intricate information, think through problems, and even evaluate my own responses for accuracy and completeness.

Some of my key capabilities include:

Complex Reasoning: I can process and respond to sophisticated queries that require multiple layers of thinking.
Reflection: I have the ability to review and reflect on my own thought processes, allowing me to provide more accurate and comprehensive answers.
Error Detection and Correction: If I detect a potential mistake in my reasoning or response, I can correct it, ensuring you receive the most accurate information possible.

You can interact with me in various ways:

Ask specific questions about topics ranging from science and history to entertainment and culture.
Engage in conversations on more abstract subjects like philosophy or ethics.
Request explanations of complex concepts or technologies.
Even have a chat just for fun, exploring interesting ideas or scenarios.

I'm constantly learning and improving my abilities, so don't hesitate to ask me anything that's on your mind. I'm here to help you find information, clarify confusing topics, or simply provide companionship through conversation.

Is there something specific you'd like to know or discuss right now?

gopi87 changed discussion title from "this model Quant doing better then old one !!" to "good one !!" Sep 9, 2024

blackcat1402

Sep 10, 2024

Please help me try this on Q3 XL, thanks in advance:

In my kitchen there's a table with a cup with a ball inside. I moved the cup to my bed in my bedroom and turned the cup upside down. I grabbed the cup again and moved to the main room. Where's the ball now?

gopi87

Sep 11, 2024

@blackcat1402 I am done with this one. The model did well at reasoning but did worse on coding, and I am more focused on coding tasks. The idea is good; hope we will get a proper model.
