Text Generation
Transformers
PyTorch
TensorBoard
Safetensors
gpt_bigcode
Generated from Trainer
text-generation-inference
Instructions to use HuggingFaceH4/starchat-beta with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use HuggingFaceH4/starchat-beta with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="HuggingFaceH4/starchat-beta")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/starchat-beta")
model = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/starchat-beta")
```
- Notebooks
- Google Colab
- Kaggle
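Raw completions from the pipeline above will not behave like a chat assistant on their own; StarChat models are trained on a dialogue format. Below is a minimal sketch of chat-style prompting, assuming the `<|system|>`/`<|user|>`/`<|assistant|>`/`<|end|>` template implied by the model's special tokens (verify it against the model card before relying on it):

```python
# A minimal sketch of chat-style prompting with StarChat's dialogue tokens.
# The <|system|>/<|user|>/<|assistant|>/<|end|> template is an assumption
# based on the model's special tokens; verify against the model card.
from transformers import pipeline

pipe = pipeline("text-generation", model="HuggingFaceH4/starchat-beta")

prompt_template = "<|system|>\n{system}<|end|>\n<|user|>\n{query}<|end|>\n<|assistant|>"
prompt = prompt_template.format(system="", query="How do I sort a list in Python?")

# Stop generating once the model emits its <|end|> turn delimiter.
end_token_id = pipe.tokenizer.convert_tokens_to_ids("<|end|>")

outputs = pipe(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.2,
    top_p=0.95,
    eos_token_id=end_token_id,
)
print(outputs[0]["generated_text"])
```

Passing `eos_token_id` explicitly matters here: without it, generation runs past the assistant's turn and the model may start writing the next user message itself.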
- Local Apps
- vLLM
How to use HuggingFaceH4/starchat-beta with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "HuggingFaceH4/starchat-beta"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "HuggingFaceH4/starchat-beta",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/HuggingFaceH4/starchat-beta
```
- SGLang
How to use HuggingFaceH4/starchat-beta with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "HuggingFaceH4/starchat-beta" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "HuggingFaceH4/starchat-beta",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "HuggingFaceH4/starchat-beta" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "HuggingFaceH4/starchat-beta",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use HuggingFaceH4/starchat-beta with Docker Model Runner:
```shell
docker model run hf.co/HuggingFaceH4/starchat-beta
```
- Adding Evaluation Results · #31 opened over 2 years ago by leaderboard-pr-bot
- How to use starcoder as a chat assistant like chatgpt · #30 opened over 2 years ago by Rajath-jain
- SFT taking high memory with Transformers (>5x the amount it takes to load model checkpoint) · #29 opened over 2 years ago by vermanic
- [AUTOMATED] Model Memory Requirements · #28 opened over 2 years ago by model-sizer-bot
- Incomplete Output even with max_new_tokens · 6 · #27 opened over 2 years ago by vermanic
- any example code to demo a multi-turn conversation with starchat-beta? · 1 · #25 opened over 2 years ago by alfred78
- TypeError: str expected, not NoneType · #24 opened over 2 years ago by lmw41
- How to fine-tune Starchat-beta on my question-answer dataset? · 1 · #23 opened over 2 years ago by ai-anytime
- how to make starchat run in multiple gpus · #22 opened over 2 years ago by edwardtan
- How to save and load the Peft/LoRA Finetune · 1 · #21 opened almost 3 years ago by LazerJesus
- Conversation derails after a certain number of tokens (?) · 👍 1 · 3 · #20 opened almost 3 years ago by mindplay
- Grammar and spelling errors in generation · #19 opened almost 3 years ago by huu-ontocord
- ValueError: Could not load model HuggingFaceH4/starchat-beta with any of the following classes: (, , ) · 👍 1 · 1 · #18 opened almost 3 years ago by ManavParikh
- StarChat for translating SQL dialects · #17 opened almost 3 years ago by dorianmatic
- Tokenizer causes issues in Finetuning because of special tokens in tokenization <|x|> · 5 · #16 opened almost 3 years ago by LazerJesus
- Next version · #15 opened almost 3 years ago by gsaivinay
- "Uncensoring" vs gen quality · #14 opened almost 3 years ago by ocramz
- Error while loading the model using safe tensors · #13 opened almost 3 years ago by tasheer10
- Seeking guidance on enhancing output of fine-tuned result · 6 · #12 opened almost 3 years ago by huytungst
- Expected maxsize to be an integer or none · #11 opened almost 3 years ago by satsat
- Chat using Starchat-beta · 👍 3 · 1 · #10 opened almost 3 years ago by vitvit
- Update README.md · #9 opened almost 3 years ago by saattrupdan
- The inference api returns incomplete response · 2 · #8 opened almost 3 years ago by aidan377
- RuntimeError: You must initialize the accelerate state by calling either `PartialState()` or `Accelerator()` before using the logging utility. · #7 opened almost 3 years ago by amarrrv
- ValueError: Could not load model HuggingFaceH4/starchat-beta with any of the following classes · 3 · #5 opened almost 3 years ago by hantianwei
- Inference VRAM Size · 👍 1 · 6 · #4 opened almost 3 years ago by tjohnson
- Updated eos_token to <|end|> · 👍 1 · 1 · #3 opened almost 3 years ago by grafail
- License: BigCode Open RAIL-M v1 · ❤️ 2 · 1 · #2 opened almost 3 years ago by Asaf-Yehudai