Example of usage
Hi,
could you provide a usage example? I couldn't find one either here or in the referenced GitHub repository.
Thanks!
Is Chinese supported?
You can use our model directly in Transformers. Our usage templates are shown in the "Conversation Template" section, and we will add a usage example. Thanks for the reminder!
Sorry, because of limitations of the base model, and because we used only English data for SFT, Chinese is not supported at the moment.
Thanks for the answer. Could you help me understand what "tokenize_fn" and "tokenize_special_fn" are, as used in your "Conversation Template"? Thanks!
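For readers with the same question, here is a minimal sketch of one plausible reading, not the authors' implementation. It assumes tokenize_fn maps ordinary text to a list of token ids, and tokenize_special_fn maps a single special-token string to its one id. The function name build_prompt_tokens, the role prefixes, the special-token strings, and the toy tokenizer are all hypothetical placeholders chosen so the example runs without downloading anything.

```python
from typing import Callable, Dict, List, Tuple

def build_prompt_tokens(
    messages: List[Tuple[str, str]],
    tokenize_fn: Callable[[str], List[int]],
    tokenize_special_fn: Callable[[str], int],
    role_prefix: Dict[str, str],
    bos_token: str = "<s>",              # assumed placeholder, not confirmed by the thread
    eot_token: str = "<|end_of_turn|>",  # assumed placeholder, not confirmed by the thread
) -> List[int]:
    """Concatenate role prefixes, message text, and special tokens into one id list."""
    tokens = [tokenize_special_fn(bos_token)]
    for role, text in messages:
        # Plain text (prefixes and message bodies) goes through tokenize_fn.
        tokens += tokenize_fn(role_prefix[role])
        if text:
            tokens += tokenize_fn(text)
            # Special tokens are looked up as a single id, never split into pieces.
            tokens.append(tokenize_special_fn(eot_token))
    return tokens

# Toy stand-ins for a real tokenizer, only to make the sketch self-contained.
SPECIAL_IDS = {"<s>": 1, "<|end_of_turn|>": 2}
_vocab: Dict[str, int] = {}

def toy_tokenize(text: str) -> List[int]:
    # Assign each whitespace-separated word the next free id, starting at 100.
    return [_vocab.setdefault(word, len(_vocab) + 100) for word in text.split()]

def toy_tokenize_special(token: str) -> int:
    return SPECIAL_IDS[token]

role_prefix = {"human": "Human: ", "assistant": "Assistant: "}  # assumed prefixes
messages = [("human", "Hello there"), ("assistant", "")]  # empty text = model should answer next
ids = build_prompt_tokens(messages, toy_tokenize, toy_tokenize_special, role_prefix)
print(ids)
```

With a Hugging Face tokenizer, the two callables could plausibly be thin wrappers around tokenizer.encode(text, add_special_tokens=False) and tokenizer.convert_tokens_to_ids(token). That mapping is an assumption and should be checked against the "Conversation Template" on the model card.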
I am not able to run this model locally. Do you have any resources, or a script, showing how to do it?