Libraries

How to use reciprocate/shepherd-13b with Transformers:

```
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="reciprocate/shepherd-13b")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("reciprocate/shepherd-13b")
model = AutoModelForCausalLM.from_pretrained("reciprocate/shepherd-13b")
```

Local Apps

vLLM

How to use reciprocate/shepherd-13b with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "reciprocate/shepherd-13b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "reciprocate/shepherd-13b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
```
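
For callers who prefer Python over curl, the same request can be sketched with the standard library alone. The helper names `build_completion_request` and `complete` are hypothetical, and the response parsing assumes the server returns the usual OpenAI-style completions body with a `choices` list; adjust it if your server version responds differently.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build the POST request that mirrors the curl call (hypothetical helper)."""
    payload = {
        "model": "reciprocate/shepherd-13b",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text.

    Assumes an OpenAI-style response: {"choices": [{"text": ...}, ...]}.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` should return the continuation as a string. Passing `base_url="http://localhost:30000"` targets a server listening on port 30000 instead.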

Use Docker

```
docker model run hf.co/reciprocate/shepherd-13b
```

SGLang

How to use reciprocate/shepherd-13b with SGLang:

Install from pip and serve model

```
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "reciprocate/shepherd-13b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "reciprocate/shepherd-13b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
```

Use Docker images

```
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "reciprocate/shepherd-13b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "reciprocate/shepherd-13b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
```

Docker Model Runner
How to use reciprocate/shepherd-13b with Docker Model Runner:
```
docker model run hf.co/reciprocate/shepherd-13b
```

Model Description

Shepherd-13b is stabilityai/StableBeluga-13B fine-tuned on Shepherd data.

Usage

Shepherd-13b should be used with this prompt format:

```
### System:
Please try to critique the answer to the question below.

### User:
<|Start of the Question|>
{question}
<|End of the Question|>

<|Start of the Answer|>
{answer}
<|End of the Answer|>

### Assistant:
```
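
The prompt format above can be filled in programmatically. The following is a minimal sketch: `SYSTEM_PROMPT`, `PROMPT_TEMPLATE`, and `build_prompt` are hypothetical names, and the single trailing newline after `### Assistant:` is an assumption, since the model card does not say how the prompt should end.

```python
SYSTEM_PROMPT = "Please try to critique the answer to the question below."

# Template copied from the prompt format in the model card.
# Assumption: the prompt ends with one newline after "### Assistant:".
PROMPT_TEMPLATE = (
    "### System:\n"
    "{system}\n"
    "\n"
    "### User:\n"
    "<|Start of the Question|>\n"
    "{question}\n"
    "<|End of the Question|>\n"
    "\n"
    "<|Start of the Answer|>\n"
    "{answer}\n"
    "<|End of the Answer|>\n"
    "\n"
    "### Assistant:\n"
)


def build_prompt(question, answer):
    """Return a Shepherd-style critique prompt for one question/answer pair."""
    return PROMPT_TEMPLATE.format(
        system=SYSTEM_PROMPT,
        question=question.strip(),  # drop stray surrounding whitespace
        answer=answer.strip(),
    )
```

The returned string can be passed to a Transformers text-generation pipeline or sent as the `prompt` field of a completions request.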

Example input:

```
### System:
Please try to critique the answer to the question below.

### User:
<|Start of the Question|>
While at the dollar store, Sloane counts 100 customers entering the store. The next day, she counts 50 more customers than the first day. If the total number of customers by the third day was 500, how many customers did she count on the third day?
<|End of the Question|>

<|Start of the Answer|>
On the first day, Sloane counts 100+50 = <<100+50=150>>150 customers.
On the second day, she counts 150+50 = <<150+50=200>>200 customers.
By the third day, the total number of customers is 200+150 = <<200+150=350>>350
#### 350
<|End of the Answer|>

### Assistant:
```

Output:

The answer misses a step. It doesn't count the customers on the third day. The answer says "the total number of customers is 200+150 = <<200+150=350>>350".
