Libraries

How to use colawolfie/Chen-9B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

# The messages below mix an image and text, so the image-text-to-text task is used
pipe = pipeline("image-text-to-text", model="colawolfie/Chen-9B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("colawolfie/Chen-9B")
model = AutoModelForMultimodalLM.from_pretrained("colawolfie/Chen-9B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

vLLM

How to use colawolfie/Chen-9B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "colawolfie/Chen-9B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "colawolfie/Chen-9B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/colawolfie/Chen-9B

SGLang

How to use colawolfie/Chen-9B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "colawolfie/Chen-9B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "colawolfie/Chen-9B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "colawolfie/Chen-9B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "colawolfie/Chen-9B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use colawolfie/Chen-9B with Docker Model Runner:
```
docker model run hf.co/colawolfie/Chen-9B
```

Chen-9B

Character.AI + Qwen = Ch-en

Description

Chen-9B is a fine-tune of Qwen/Qwen3.5-9B-Base that aims to replicate the chat style of the old "c.ai 1.1" model from Character.AI. It does this by distilling Character.AI outputs directly, using the PIPPA dataset from PygmalionAI and a private set of datasets collected from sources that Character.AI likely used when training c.ai 1.1. The model is something of a success: it emulates the old Character.AI closely, perhaps a bit too well. It even reproduces some of the old quirks, such as human characters sprouting tails or falling into the "c-can i ask you a question?"/"promise you won't get mad?" loop.

I made it for nostalgia and to satisfy my curiosity about Character.AI's old token distributions (or at least something close to them). I am sharing it in case others want to relive old memories, study the model, roleplay with it, or just play around with it. It was trained with 4-bit QLoRA, then merged into a full bfloat16 model for convenience and ease of deployment. If you want a ready-to-use, self-contained GGUF, or you already have a copy of Qwen3.5-9B-Base and only want the LoRA, use these links:

GGUF: https://huggingface.co/colawolfie/Chen-9B-GGUF
LoRA: https://huggingface.co/colawolfie/Chen-9B-LoRA

Example & Usage

Basics:

Chat template: Normal ChatML (no reasoning)
Recommended sampling parameters (play around! try weird samplers like XTC too!): temp = 0.8 - 1.2, top_k = 40, top_p = 0.95, min_p = 0.05
Repetition penalty: Much of the source data was at least partly generated by the old Character.AI, which used now-dated LLM technology (and probably an unusually high temperature), so you may see repetition. Use rep_pen = 1.01 - 1.09 to mitigate it.
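
The recommendations above can be sent to an OpenAI-compatible server such as the vLLM instance from the earlier section (port 8000). The following is a minimal sketch, not the author's own client. `build_payload` and `send` are hypothetical helper names, the defaults (temperature 1.0, repetition penalty 1.05) are arbitrary picks from within the recommended ranges, and it assumes the server accepts `top_k`, `min_p`, and `repetition_penalty` as extra request fields. vLLM documents these as extra sampling parameters; other servers may name them differently or ignore them.

```python
import json
import urllib.request

MODEL_ID = "colawolfie/Chen-9B"


def build_payload(messages, temperature=1.0, repetition_penalty=1.05):
    """Build a chat-completions request body with the recommended samplers.

    The sampler field names below are an assumption about the server
    (they match vLLM's extra sampling parameters).
    """
    return {
        "model": MODEL_ID,
        "messages": messages,
        "temperature": temperature,                 # recommended: 0.8 - 1.2
        "top_k": 40,
        "top_p": 0.95,
        "min_p": 0.05,
        "repetition_penalty": repetition_penalty,   # recommended: 1.01 - 1.09
    }


def send(payload, base_url="http://localhost:8000"):
    """POST the payload to /v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

`build_payload` can be inspected offline; `send(build_payload(messages))` needs a running server.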

This is not an instruct-tuned model. It expects the system prompt to be a description of the character that the LLM is meant to roleplay as. Even so, you may find that the model follows instructions, possibly because of how Qwen3.5-9B-Base was trained (e.g. it may have seen AI assistant conversations during pre-training, or had at least a little instruct tuning baked in despite being labelled a "base" model). If you leave the system prompt empty or do not describe a character in it, be prepared: you will get either an AI assistant or a random character!

Here is a within-scope usage example (shown in shorthand for readability; real prompts should be formatted as ChatML):

system: Leo is a fluffy male anthro wolf with brown and cream fur who loves to chat.
user: hey
assistant: *Leo turns to face the person, his fluffy tail wagging a bit.* "Hey there, what's up?"

Conversely, no system prompt (out-of-scope usage, but sometimes fun):

user: hey
assistant: hey there :) how are you?

These are real responses from the model and, as you can see, it behaves just like the old Character.AI. Be warned that there is no built-in content filter or safety alignment; keep that in mind if you plan to serve the model publicly through an API or app.
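
The transcripts in this section use a shorthand `role: text` notation. As a minimal sketch of how the within-scope example maps onto plain ChatML, the hypothetical helper `to_chatml` below wraps each message in `<|im_start|>` / `<|im_end|>` markers and leaves an open assistant turn for the model to complete. This is the generic ChatML layout, written out by hand for illustration; the chat template bundled with the tokenizer is the authoritative version and may differ in detail.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = [
        f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        for message in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model writes the character's reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    # The system prompt is the character description, not an instruction.
    {
        "role": "system",
        "content": "Leo is a fluffy male anthro wolf with brown and cream fur who loves to chat.",
    },
    {"role": "user", "content": "hey"},
]

print(to_chatml(messages))
```

When calling a chat-completions endpoint or `apply_chat_template`, this formatting is applied for you, so the helper is only needed for raw text-completion interfaces.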

Scope

This model has only been fine-tuned (SFT) on Character.AI outputs (PIPPA and my own dump), a small amount of AO3, and Discord chat/roleplay servers, so it may occasionally generate NSFW or otherwise offensive content when prompted (or even spontaneously). No DPO or RLHF/RLAIF has been done to ground the model in truth, high-quality coding, or other instruct tasks, and it should never be used as a source of factual information.

You have been warned. This model is only intended to be studied or used for entertainment purposes such as roleplay, creative writing, or other experiments.

License

This model, like the original Qwen3.5-9B-Base from Alibaba, is licensed under the Apache License 2.0. If you are an API provider, app developer, or anyone else, you are permitted to serve this model under your own terms, provided you follow the Apache License 2.0.

Downloads last month: 18

Safetensors
Model size: 9B params
Tensor type: BF16

Model tree for colawolfie/Chen-9B
Base model: Qwen/Qwen3.5-9B-Base
Finetuned (120): this model
Quantizations: 2 models
