Instructions to use verseAI/vai-GPT-NeoXT-Chat-Base-20B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use verseAI/vai-GPT-NeoXT-Chat-Base-20B with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="verseAI/vai-GPT-NeoXT-Chat-Base-20B")# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("verseAI/vai-GPT-NeoXT-Chat-Base-20B") model = AutoModelForCausalLM.from_pretrained("verseAI/vai-GPT-NeoXT-Chat-Base-20B") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use verseAI/vai-GPT-NeoXT-Chat-Base-20B with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "verseAI/vai-GPT-NeoXT-Chat-Base-20B" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "verseAI/vai-GPT-NeoXT-Chat-Base-20B", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/verseAI/vai-GPT-NeoXT-Chat-Base-20B
- SGLang
How to use verseAI/vai-GPT-NeoXT-Chat-Base-20B with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "verseAI/vai-GPT-NeoXT-Chat-Base-20B" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "verseAI/vai-GPT-NeoXT-Chat-Base-20B", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "verseAI/vai-GPT-NeoXT-Chat-Base-20B" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "verseAI/vai-GPT-NeoXT-Chat-Base-20B", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use verseAI/vai-GPT-NeoXT-Chat-Base-20B with Docker Model Runner:
docker model run hf.co/verseAI/vai-GPT-NeoXT-Chat-Base-20B
Commit History
Update handler.py 2d3a261
Update handler.py 0ce914b
first change to make as q-a pipeline 5521db8
use pipeline ac360bc
manish commited on
change return type 28e5926
manish commited on
add requirements file for dependencies 0d9a7b6
manish commited on
edit readme to link to source of fork 22aaca8
manish commited on
add custom handler and gitignore a6c74eb
manish commited on