Model Card for MentalChat-16K
Model Details
This model is a fine-tuned version of Llama-3.2-1B-Instruct, optimized for empathetic and supportive conversations in the mental health domain. It was trained on the ShenLab/MentalChat16K dataset, which includes over 16,000 counseling-style Q&A examples, combining real clinical paraphrases and synthetic mental health dialogues. The model is designed to understand and respond to emotionally nuanced prompts related to stress, anxiety, relationships, and personal well-being.
Model Description
- Language(s) (NLP): English
- License: MIT
- Finetuned from model: unsloth/Llama-3.2-1B-Instruct
- Dataset: ShenLab/MentalChat16K
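As a rough illustration of how a counseling-style Q&A example could be turned into chat messages for a model like this one, here is a minimal sketch. The field names `instruction`, `input`, and `output` are an assumption about the dataset's layout, and `record_to_messages` and the sample record are hypothetical. This is not necessarily the preprocessing used to train the model.

```python
from typing import Dict, List


def record_to_messages(record: Dict[str, str]) -> List[Dict[str, str]]:
    """Convert one Q&A record into a system/user/assistant message list.

    Assumes (not confirmed by the model card) that each record carries
    'instruction', 'input', and 'output' string fields.
    """
    messages: List[Dict[str, str]] = []
    instruction = record.get("instruction", "").strip()
    if instruction:
        # The instruction, when present, becomes the system prompt.
        messages.append({"role": "system", "content": instruction})
    messages.append({"role": "user", "content": record["input"].strip()})
    messages.append({"role": "assistant", "content": record["output"].strip()})
    return messages


# Hypothetical record, invented purely for illustration.
sample_record = {
    "instruction": "You are a helpful mental health counselling assistant.",
    "input": "  I have trouble sleeping before exams.  ",
    "output": "That sounds stressful. What usually goes through your mind at night?",
}

example_messages = record_to_messages(sample_record)
for message in example_messages:
    print(message["role"], "->", message["content"])
```

A message list in this shape can be passed to a tokenizer's `apply_chat_template`, which is the same format the getting-started code uses at inference time.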
Uses
This model is intended for research and experimentation in AI-driven mental health support. Key use cases include:
- Mental health chatbot prototypes
- Empathy-focused dialogue agents
- Benchmarking LLMs on emotional intelligence and counseling-style prompts
- Educational or training tools in psychology or mental health communication
This model is NOT intended for clinical diagnosis, therapy, or real-time intervention. It must not replace licensed mental health professionals.
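For the benchmarking use case listed above, a small harness that gathers responses for a fixed prompt set might look like the sketch below. The names `collect_responses` and `stub_generate` are hypothetical, and the word count is only a placeholder statistic, not an empathy or quality metric. In practice `generate_fn` would wrap a call to the model.

```python
from typing import Callable, Dict, List


def collect_responses(
    prompts: List[str],
    generate_fn: Callable[[str], str],
) -> List[Dict[str, object]]:
    """Run generate_fn on every prompt and record the reply and its word count."""
    results: List[Dict[str, object]] = []
    for prompt in prompts:
        reply = generate_fn(prompt)
        results.append(
            {
                "prompt": prompt,
                "response": reply,
                # Placeholder statistic; swap in a real scoring function.
                "response_words": len(reply.split()),
            }
        )
    return results


def stub_generate(prompt: str) -> str:
    """Stand-in for a real model call so the harness runs without a GPU."""
    return "That sounds difficult. Can you tell me more?"


benchmark_prompts = [
    "I feel anxious before every meeting.",
    "I keep arguing with my sister and I don't know why.",
]
benchmark_results = collect_responses(benchmark_prompts, stub_generate)
for row in benchmark_results:
    print(row["response_words"], "words for:", row["prompt"])
```

Keeping the generation function pluggable makes it possible to run the same prompt set against several models and compare the collected rows.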
Bias, Risks, and Limitations
Biases:
- The real interview data is biased toward caregivers (mostly White, female, U.S.-based), which may affect the model’s cultural and demographic generalizability.
- The synthetic dialogues are generated by GPT-3.5, which may introduce linguistic and cultural biases from its pretraining.
Limitations:
- The base model, Llama-3.2-1B-Instruct, is small (about 1B parameters), which limits depth of reasoning and nuanced understanding.
- Not suitable for handling acute mental health crises or emergency counseling.
- Responses may lack therapeutic rigor or miss subtle psychological cues.
- May produce hallucinated or inaccurate mental health advice.
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

# Load the tokenizer and model; device_map={"": 0} places the whole model on GPU 0.
tokenizer = AutoTokenizer.from_pretrained("khazarai/MentalChat-16K")
model = AutoModelForCausalLM.from_pretrained(
    "khazarai/MentalChat-16K",
    device_map={"": 0},
)

system = """You are a helpful mental health counselling assistant, please answer the mental health questions based on the patient's description.
The assistant gives helpful, comprehensive, and appropriate answers to the user's questions.
"""

question = """
I've been feeling overwhelmed by my responsibilities at work and caring for my aging parents. I've reached a point where I don't know what else I can do, and I'm struggling to communicate this to my boss and family members. I feel guilty for even considering saying no, but I know I need to take care of myself.
"""

messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": question},
]

# Render the conversation into a prompt string that ends with the assistant turn header.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Generate with sampling and stream the reply to stdout as it is produced.
_ = model.generate(
    **tokenizer(text, return_tensors="pt").to(model.device),
    max_new_tokens=900,
    do_sample=True,  # required for temperature, top_p, and top_k to take effect
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)
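The code above handles a single question. For a multi-turn chatbot prototype, the message list has to grow with each exchange; the sketch below shows one way to manage it. The helpers `start_conversation` and `add_turn` are hypothetical names introduced for illustration and are not part of Transformers.

```python
from typing import Dict, List

ALLOWED_ROLES = {"system", "user", "assistant"}


def start_conversation(system_prompt: str) -> List[Dict[str, str]]:
    """Create a new history that begins with the system prompt."""
    return [{"role": "system", "content": system_prompt.strip()}]


def add_turn(
    history: List[Dict[str, str]], role: str, content: str
) -> List[Dict[str, str]]:
    """Return a new history with one more message; the input list is not mutated."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    text = content.strip()
    if not text:
        raise ValueError("message content is empty")
    return history + [{"role": role, "content": text}]


history = start_conversation("You are a helpful mental health counselling assistant.")
history = add_turn(history, "user", "  I feel overwhelmed at work.\n")
history = add_turn(history, "assistant", "That sounds exhausting. What has been hardest?")
history = add_turn(history, "user", "Saying no to extra tasks.")

# `history` can now be passed to tokenizer.apply_chat_template(...) in place of `messages`.
print(len(history), "messages; last role:", history[-1]["role"])
```

After each generation, the decoded reply would be appended with `add_turn(history, "assistant", reply)` before the next user message is added.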