Instructions to use T145/ZEUS-8B-V2 with libraries and local apps.

Libraries

How to use T145/ZEUS-8B-V2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="T145/ZEUS-8B-V2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the output is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("T145/ZEUS-8B-V2")
model = AutoModelForCausalLM.from_pretrained("T145/ZEUS-8B-V2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use T145/ZEUS-8B-V2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "T145/ZEUS-8B-V2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "T145/ZEUS-8B-V2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
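The same request can be sent from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server started above is listening on `localhost:8000` and returns the usual OpenAI-style `choices[0].message.content` field. The helper names `build_chat_request` and `ask` are illustrative, not part of vLLM.

```python
import json
import urllib.request

# Assumed values: the URL and model name mirror the curl call above.
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "T145/ZEUS-8B-V2"


def build_chat_request(prompt, model=MODEL, url=BASE_URL):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("What is the capital of France?"))` should print the model's answer. Pointing `url` at port 30000 should work the same way against an SGLang server, since both expose an OpenAI-compatible endpoint.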

Use Docker

docker model run hf.co/T145/ZEUS-8B-V2

SGLang

How to use T145/ZEUS-8B-V2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "T145/ZEUS-8B-V2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "T145/ZEUS-8B-V2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "T145/ZEUS-8B-V2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "T145/ZEUS-8B-V2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use T145/ZEUS-8B-V2 with Docker Model Runner:
```
docker model run hf.co/T145/ZEUS-8B-V2
```

[Community] Please share your prompts and experiences: positive or negative!

by T145 - opened Dec 10, 2024

Discussion

T145

Owner Dec 10, 2024

I do a lot of my own testing, but it'd be nice to hear from other users if they've had any significantly positive or negative experiences using this model. When posting, please be sure to include the quant used (as quants below a static Q8_0 may have precision loss) and inference settings. Thanks, GLHF!
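One way to keep the requested details together is to record them in a small structure before posting. This is only a sketch: the `FeedbackReport` class, its field names, and the sample values are hypothetical placeholders, not settings recommended by the model author.

```python
from dataclasses import asdict, dataclass


@dataclass
class FeedbackReport:
    """Hypothetical record of one test run; field names are illustrative."""
    model: str
    quant: str            # e.g. "Q8_0", or "none" for unquantized weights
    temperature: float
    top_p: float
    max_new_tokens: int
    prompt: str
    verdict: str          # "positive" or "negative"

    def to_text(self):
        """Render the report as plain text suitable for a discussion post."""
        return "\n".join(f"{key}: {value}" for key, value in asdict(self).items())


report = FeedbackReport(
    model="T145/ZEUS-8B-V2",
    quant="Q8_0",
    temperature=0.7,      # placeholder values, not recommended settings
    top_p=0.9,
    max_new_tokens=40,
    prompt="Who are you?",
    verdict="positive",
)
print(report.to_text())
```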

T145

Owner Dec 16, 2024

@John6666 Have you performed any tests on the Zeus models? You seem to be a fan!

John6666

Dec 17, 2024

Hello. I test LLMs by giving each one a theme, having it come up with a story, and then having it generate a caption-style image prompt. It's a pretty inefficient way of testing prompt generation, but it's fun to see how each LLM responds. The theme is given in NSFW Japanese and the prompt is output in English, so the test also requires multilingual ability. It's not really fair, but it's a kind of benchmark.😅
Fewer than half of the models manage to produce the final prompt. Some refuse to respond, and others fail because the task requires overcoming several challenges: language comprehension, understanding instructions, real-world knowledge, vocabulary, and creativity. I always give Likes to LLMs that overcome these challenges, though that doesn't mean I never give Likes otherwise.
The ZEUS series consistently produces prompts in this test. If I notice anything special in future tests, I'll come back here to write about it.😀
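The theme, story, and image-prompt chain described above can be sketched as a two-turn conversation. This is an illustration under stated assumptions, not the commenter's actual setup: the instruction wording is invented, and `generate` is a hypothetical callable that takes a list of chat messages and returns the reply as a string.

```python
def run_theme_test(theme, generate):
    """Run the theme -> story -> image-prompt chain with a caller-supplied model."""
    messages = [
        {"role": "user", "content": f"Write a short story on this theme: {theme}"},
    ]
    story = generate(messages)

    # Keep the story in the conversation so the second request can refer to it.
    messages.append({"role": "assistant", "content": story})
    messages.append({
        "role": "user",
        "content": "Write an English caption-style image prompt for that story.",
    })
    image_prompt = generate(messages)

    # Only an empty reply counts as a failure here; refusals are not detected.
    return {
        "story": story,
        "image_prompt": image_prompt,
        "completed": bool(image_prompt.strip()),
    }


def fake_generate(messages):
    """Stand-in model for demonstration: reports how many messages it received."""
    return f"reply to {len(messages)} message(s)"


result = run_theme_test("a lighthouse keeper", fake_generate)
print(result["completed"])
```

In real use, `generate` would wrap a call to the model, for example through the Transformers pipeline or an HTTP request to a local server, and detecting refusals would need an additional check.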

T145

Owner Dec 17, 2024

Interesting; do you have a more SFW example you could show? If it's in JP that's OK. I'd be particularly interested if you have an example that you've used across a couple of models! V2, V8, and V2-ORPO are my current recommendations for regular use.

John6666

Dec 17, 2024

I haven't tried many SFW examples...
I think some LLM leaderboards evaluate them numerically.
When a model can't complete the task above within a few attempts, I sometimes try very simple SFW system and user prompts to check whether the model is broken. I haven't done this for ZEUS because it completed the task.
I don't have much opportunity to evaluate individual models in detail...
Of course I could test ZEUS's Japanese language capabilities, but I have no reliable baseline in my head to compare against.

Looking at the number of downloads of GGUF on HF in general, I think that someone somewhere in the world is using it...
There is very little feedback on the Hub. Well, that's how OSS is, including AI.🙄
Some model authors have also set up their own demos and collect the usage history in their datasets. I haven't read the source code in detail, but I think that would be useful for testing how a model behaves.
