Instructions for using scandukuri/mistral-stargate with libraries, inference providers, notebooks, and local apps. Follow the sections below to get started.
- Libraries
- Transformers
How to use scandukuri/mistral-stargate with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="scandukuri/mistral-stargate")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("scandukuri/mistral-stargate")
model = AutoModelForCausalLM.from_pretrained("scandukuri/mistral-stargate")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
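Generation parameters can also be passed directly through the pipeline call. A minimal sketch; the sampling values below are illustrative, not settings recommended by the model authors:

```python
# Illustrative sampling settings (not from the model card): enable
# sampling and cap the response length when calling the pipeline.
from transformers import pipeline

pipe = pipeline("text-generation", model="scandukuri/mistral-stargate")
messages = [{"role": "user", "content": "Who are you?"}]
outputs = pipe(messages, max_new_tokens=64, do_sample=True, temperature=0.7, top_p=0.9)

# For chat-style input, the pipeline returns the full conversation;
# the last message is the model's reply.
print(outputs[0]["generated_text"][-1]["content"])
```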
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use scandukuri/mistral-stargate with vLLM:
Install from pip and serve the model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "scandukuri/mistral-stargate"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "scandukuri/mistral-stargate",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
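The same endpoint can be called from Python. A minimal sketch using the openai client (`pip install openai`); the api_key value is a placeholder, since vLLM does not check it by default:

```python
# Query the running vLLM server through its OpenAI-compatible API.
from openai import OpenAI

# "EMPTY" is a placeholder; vLLM does not require a real API key by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="scandukuri/mistral-stargate",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```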
Use Docker

```sh
docker model run hf.co/scandukuri/mistral-stargate
```
- SGLang
How to use scandukuri/mistral-stargate with SGLang:
Install from pip and serve the model
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "scandukuri/mistral-stargate" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "scandukuri/mistral-stargate",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
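As with vLLM, the server speaks the OpenAI chat-completions protocol, so any HTTP client works. A minimal sketch with requests, assuming the server above is running locally on port 30000:

```python
# Call the SGLang server's OpenAI-compatible endpoint with plain HTTP.
import requests

resp = requests.post(
    "http://localhost:30000/v1/chat/completions",
    json={
        "model": "scandukuri/mistral-stargate",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```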
Use Docker images

```sh
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "scandukuri/mistral-stargate" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "scandukuri/mistral-stargate",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

- Docker Model Runner
How to use scandukuri/mistral-stargate with Docker Model Runner:
```sh
docker model run hf.co/scandukuri/mistral-stargate
```
STaR-GATE
This repository contains the model from the main experiment in STaR-GATE: Teaching Language Models to Ask Clarifying Questions. The weights in this repository correspond to the blue line in the paper's left-hand win-rate plot.
When prompting language models to complete a task, users often leave important aspects unsaid. While asking questions could resolve this ambiguity (GATE; Li et al., 2023), models often struggle to ask good questions. We explore a language model's ability to self-improve (STaR; Zelikman et al., 2022) by rewarding the model for generating useful questions, a simple method we dub STaR-GATE. We generate a synthetic dataset of 25,500 unique persona-task prompts to simulate conversations between a pretrained language model (the Questioner) and a Roleplayer whose preferences are unknown to the Questioner. By asking questions, the Questioner elicits preferences from the Roleplayer. The Questioner is iteratively finetuned on questions that increase the probability of high-quality responses to the task, which are generated by an Oracle with access to the Roleplayer's latent preferences. After two iterations of self-improvement, the Questioner asks better questions, allowing it to generate responses that are preferred over responses from the initial model on 72% of tasks. Our results indicate that teaching a language model to ask better questions leads to better personalized responses.
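As a schematic, the self-improvement loop described above can be summarized in pseudocode. Everything here is a hypothetical stand-in (function names, scoring rule, threshold); the authors' actual prompts, scoring, and finetuning details are in the paper and the linked code:

```python
# Hypothetical, heavily simplified schematic of the STaR-GATE loop
# described above; every helper and object here is a placeholder,
# not the authors' actual implementation.

def star_gate_iteration(questioner, roleplayer, oracle, prompts, threshold=0.0):
    """One self-improvement round: keep dialogues whose questions raise
    the likelihood of the Oracle's gold response, then finetune on them."""
    kept = []
    for prompt in prompts:
        # The Questioner asks clarifying questions; the Roleplayer
        # answers from its hidden persona preferences.
        dialogue = questioner.elicit(roleplayer, prompt)

        # The Oracle sees the latent preferences and writes a
        # high-quality "gold" response to the task.
        gold = oracle.respond(prompt, roleplayer.persona)

        # Keep dialogues whose questions make the gold response
        # more likely under the Questioner.
        if questioner.log_prob(gold, context=dialogue) > threshold:
            kept.append(dialogue)

    # Finetune the Questioner on its own useful questions (STaR-style).
    return questioner.finetune(kept)
```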
Usage
Reference the paper's appendix sections A.5.2 (Figure 14: Questioner Elicitation Prompt) and A.6.2 (Figure 17: Questioner Win-Rate Response Prompt) to see how to prompt the model for elicitation or for final responses. All code and data for the project can be found here.
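As a starting point, the model can be driven through the standard chat template. A minimal sketch; the placeholder user message must be replaced with the actual elicitation prompt from Figure 14 (or the response prompt from Figure 17), which is not reproduced here:

```python
# Minimal sketch of querying the model for elicitation. The message
# content is a placeholder; substitute the exact prompt from the
# paper's Figure 14 (elicitation) or Figure 17 (final response).
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("scandukuri/mistral-stargate")
model = AutoModelForCausalLM.from_pretrained("scandukuri/mistral-stargate")

messages = [
    {"role": "user", "content": "<Figure 14 elicitation prompt + task>"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```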