Text Generation
Transformers
PyTorch
TensorBoard
Safetensors
gpt_bigcode
Generated from Trainer
text-generation-inference
Instructions to use HuggingFaceH4/starchat-beta with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use HuggingFaceH4/starchat-beta with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="HuggingFaceH4/starchat-beta")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/starchat-beta")
model = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/starchat-beta")
```
- Notebooks
- Google Colab
- Kaggle
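Raw completions from the pipeline above will not behave like a chat assistant on their own; StarChat models are trained on a dialogue format. Below is a minimal sketch of chat-style prompting, assuming the `<|system|>`/`<|user|>`/`<|assistant|>`/`<|end|>` template implied by the model's special tokens (verify it against the model card before relying on it):

```python
# A minimal sketch of chat-style prompting with StarChat's dialogue tokens.
# The <|system|>/<|user|>/<|assistant|>/<|end|> template is an assumption
# based on the model's special tokens; verify against the model card.
from transformers import pipeline

pipe = pipeline("text-generation", model="HuggingFaceH4/starchat-beta")

prompt_template = "<|system|>\n{system}<|end|>\n<|user|>\n{query}<|end|>\n<|assistant|>"
prompt = prompt_template.format(system="", query="How do I sort a list in Python?")

# Stop generating once the model emits its <|end|> turn delimiter.
end_token_id = pipe.tokenizer.convert_tokens_to_ids("<|end|>")

outputs = pipe(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.2,
    top_p=0.95,
    eos_token_id=end_token_id,
)
print(outputs[0]["generated_text"])
```

Passing `eos_token_id` explicitly matters here: without it, generation runs past the assistant's turn and the model may start writing the next user message itself.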
- Local Apps
- vLLM
How to use HuggingFaceH4/starchat-beta with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "HuggingFaceH4/starchat-beta"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "HuggingFaceH4/starchat-beta",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/HuggingFaceH4/starchat-beta
```
- SGLang
How to use HuggingFaceH4/starchat-beta with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "HuggingFaceH4/starchat-beta" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "HuggingFaceH4/starchat-beta",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "HuggingFaceH4/starchat-beta" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "HuggingFaceH4/starchat-beta",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use HuggingFaceH4/starchat-beta with Docker Model Runner:
```shell
docker model run hf.co/HuggingFaceH4/starchat-beta
```
- Adding Evaluation Results · #31 opened over 2 years ago by leaderboard-pr-bot
- How to use starcoder as a chat assistant like chatgpt · #30 opened over 2 years ago by Rajath-jain
- SFT taking high memory with Transformers (>5x the amount it takes to load model checkpoint) · #29 opened over 2 years ago by vermanic
- [AUTOMATED] Model Memory Requirements · #28 opened over 2 years ago by model-sizer-bot
- Incomplete Output even with max_new_tokens · 6 · #27 opened over 2 years ago by vermanic
- any example code to demo a multi-turn conversation with starchat-beta? · 1 · #25 opened over 2 years ago by alfred78
- TypeError: str expected, not NoneType · #24 opened over 2 years ago by lmw41
- How to fine-tune Starchat-beta on my question-answer dataset? · 1 · #23 opened over 2 years ago by ai-anytime
- how to make starchat run in multiple gpus · #22 opened over 2 years ago by edwardtan
- How to save and load the Peft/LoRA Finetune · 1 · #21 opened almost 3 years ago by LazerJesus
- Conversation derails after a certain number of tokens (?) · 👍 1 · 3 · #20 opened almost 3 years ago by mindplay
- Grammar and spelling errors in generation · #19 opened almost 3 years ago by huu-ontocord
- ValueError: Could not load model HuggingFaceH4/starchat-beta with any of the following classes: (, , ) · 👍 1 · 1 · #18 opened almost 3 years ago by ManavParikh
- StarChat for translating SQL dialects · #17 opened almost 3 years ago by dorianmatic
- Tokenizer causes issues in Finetuning because of special tokens in tokenization <|x|> · 5 · #16 opened almost 3 years ago by LazerJesus
- Next version · #15 opened almost 3 years ago by gsaivinay
- "Uncensoring" vs gen quality · #14 opened almost 3 years ago by ocramz
- Error while loading the model using safe tensors · #13 opened almost 3 years ago by tasheer10
- Seeking guidance on enhancing output of fine-tuned result · 6 · #12 opened almost 3 years ago by huytungst
- Expected maxsize to be an integer or none · #11 opened almost 3 years ago by satsat
- Chat using Starchat-beta · 👍 3 · 1 · #10 opened almost 3 years ago by vitvit
- Update README.md · #9 opened almost 3 years ago by saattrupdan
- The inference api returns incomplete response · 2 · #8 opened almost 3 years ago by aidan377
- RuntimeError: You must initialize the accelerate state by calling either `PartialState()` or `Accelerator()` before using the logging utility. · #7 opened almost 3 years ago by amarrrv
- ValueError: Could not load model HuggingFaceH4/starchat-beta with any of the following classes · 3 · #5 opened almost 3 years ago by hantianwei
- Inference VRAM Size · 👍 1 · 6 · #4 opened almost 3 years ago by tjohnson
- Updated eos_token to <|end|> · 👍 1 · 1 · #3 opened almost 3 years ago by grafail
- License: BigCode Open RAIL-M v1 · ❤️ 2 · 1 · #2 opened almost 3 years ago by Asaf-Yehudai