Instructions for using Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2 with libraries and local apps.

Libraries

How to use Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2")
model = AutoModelForCausalLM.from_pretrained("Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
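
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and the response shape (`choices[0].message.content`) is assumed to follow the OpenAI chat completions format that the server advertises.

```python
import json
import urllib.request

MODEL_ID = "Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2"


def build_chat_request(prompt, model=MODEL_ID, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the text of the first reply.

    Assumes a server is already running at base_url and that it returns
    an OpenAI-style body with choices[0].message.content.
    """
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat("What is the capital of France?")` sends the same payload as the curl example. Passing `base_url="http://localhost:30000"` targets a server on port 30000 instead.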

Use Docker

docker model run hf.co/Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2

SGLang

How to use Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2 with Docker Model Runner:

```
docker model run hf.co/Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2
```

Configuration Parsing Warning: In config.json, "quantization_config.bits" must be an integer.
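
A plausible cause of this warning is that the quant's bit width is fractional (2.2 bpw), while the parser expects a whole number. The sketch below shows how such a check could be reproduced locally. The function name `find_non_integer_bits` and the example config fragment are hypothetical, since the actual config.json is not reproduced in this card.

```python
def find_non_integer_bits(config):
    """Return quantization_config.bits if it is set and not an int, else None."""
    bits = config.get("quantization_config", {}).get("bits")
    if bits is None or isinstance(bits, int):
        return None
    return bits


# Hypothetical fragment for illustration only; not copied from the real file.
example_config = {"quantization_config": {"quant_method": "exl2", "bits": 2.2}}

print(find_non_integer_bits(example_config))  # 2.2
```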

ExLlamaV2 quant (EXL2, 2.2 bpw) made with ExLlamaV2 v0.1.1.

Other EXL2 quants:

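| Quant (bpw) | Model Size | lm_head (bits) |
| --- | --- | --- |
| 2.2 | 2671 MB | 6 |
| 2.5 | 2958 MB | 6 |
| 3.0 | 3477 MB | 6 |
| 3.5 | 3997 MB | 6 |
| 3.75 | 4256 MB | 6 |
| 4.0 | 4515 MB | 6 |
| 4.25 | 4776 MB | 6 |
| 5.0 | 5556 MB | 6 |
| 6.0 | 6605 MB | 8 |
| 6.5 | 7137 MB | 8 |
| 8.0 | 7983 MB | 8 |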
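
As a rough aid for choosing among these quants, the sketch below picks the highest-bpw entry whose listed size fits a given budget. The rows are copied from the table of EXL2 quants; the function name `largest_quant_within` is hypothetical. The listed size covers the model files only, so the memory needed at inference time (cache, activations) will be higher by an amount this sketch does not estimate.

```python
# (bits per weight, model size in MB, lm_head bits), copied from the quant table
EXL2_QUANTS = [
    (2.2, 2671, 6),
    (2.5, 2958, 6),
    (3.0, 3477, 6),
    (3.5, 3997, 6),
    (3.75, 4256, 6),
    (4.0, 4515, 6),
    (4.25, 4776, 6),
    (5.0, 5556, 6),
    (6.0, 6605, 8),
    (6.5, 7137, 8),
    (8.0, 7983, 8),
]


def largest_quant_within(budget_mb, quants=EXL2_QUANTS):
    """Return the highest-bpw row whose listed size fits the budget, or None."""
    fitting = [row for row in quants if row[1] <= budget_mb]
    return max(fitting, key=lambda row: row[0]) if fitting else None


print(largest_quant_within(4600))  # (4.0, 4515, 6)
```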

ConvAI-9b v2: A Conversational AI Model

1. Model Details

Model Name: ConvAI-9b v2
Authors: CreitinGameplays
Date: May 29th, 2024

2. Model Description

ConvAI-9b v2 is a fine-tuned conversational AI model with 9 billion parameters. It is based on the following models:

Base Model: mistralai/Mistral-7B-v0.3
Merged Model: mistralai/Mistral-7B-Instruct-v0.3

3. Training Data

The model was fine-tuned on a custom dataset of conversations between an AI assistant and a user. The dataset format followed a specific structure:

<|system|> (system prompt, e.g.: You are a helpful AI language model called ChatGPT, your goal is helping users with their questions) </s> <|user|> (user prompt) </s>
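
A prompt in this format can be assembled with a small helper. This is a minimal sketch: the function name `format_prompt` is hypothetical, and the single-space separation around the tags is assumed from how the template is printed above. The card does not show a tag for the assistant turn, so none is added here.

```python
def format_prompt(system_prompt, user_prompt):
    """Build a prompt string following the <|system|> ... </s> <|user|> ... </s> layout."""
    return f"<|system|> {system_prompt} </s> <|user|> {user_prompt} </s>"


print(format_prompt("You are a helpful AI language model.", "Who are you?"))
```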

4. Intended Uses

ConvAI-9b v2 is intended for conversational AI applications such as:

Chatbots
Virtual assistants
Interactive storytelling
Educational tools

5. Limitations

Like any other language model, ConvAI-9b v2 may generate incorrect or misleading responses.
It may exhibit biases present in the training data.
The model's performance can be affected by the quality and format of the input text.

6. Evaluation

Coming soon.


Model tree for Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2

Base model: mistralai/Mistral-7B-v0.3
Finetuned: mistralai/Mistral-7B-Instruct-v0.3
Quantized: Zoyd/CreitinGameplays_ConvAI-9b-v2-2_2bpw_exl2 (this model)