How to use ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit with libraries and local apps.

Libraries

How to use ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

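# Load model directly
# Note: loading a GPTQ checkpoint through Transformers needs a GPTQ backend
# to be installed (for example optimum together with auto-gptq or gptqmodel).
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit")
model = AutoModelForCausalLM.from_pretrained("ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit")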
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
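
The same request can be issued from Python. The following is a minimal sketch using only the standard library; it assumes a server started as above is listening on `http://localhost:8000` and that the response follows the usual OpenAI chat-completions shape. The helper names `build_chat_request` and `ask` are hypothetical and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, base_url="http://localhost:8000"):
    """Send the request and return the assistant's reply text.

    Assumes the OpenAI-style response layout: choices[0].message.content.
    """
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("What is the capital of France?")` returns the generated reply as a string. Passing a different `base_url` (for example one using port 30000) points the same helper at another OpenAI-compatible server.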

Use Docker

docker model run hf.co/ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit

SGLang

How to use ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit with Docker Model Runner:

```
docker model run hf.co/ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit
```

Description

A 4-bit GPTQ quantization of upstage/SOLAR-10.7B-Instruct-v1.0. We use the config below for quantization and evaluation, with HuggingFaceH4/ultrachat_200k as the calibration data. The code is available under this repository.

bits: 4
damp_percent: 0.01
desc_act: true
exllama_config:
 version: 2
group_size: 128
quant_method: gptq
static_groups: false
sym: true
true_sequential: true
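
To give a feel for what `bits: 4`, `group_size: 128`, and `sym: true` mean in the config above, here is a small pure-Python sketch of group-wise symmetric quantization. It uses plain round-to-nearest, not GPTQ's error-compensating update, and it does not model `damp_percent`, `desc_act`, or `true_sequential`. The function names are hypothetical, and the grid (codes 0 to 15 with a fixed zero point of 8) is an assumption about how a symmetric 4-bit grid is typically laid out.

```python
def quantize_group(weights, bits=4):
    """Symmetric round-to-nearest quantization of one group of weights.

    Illustration only: GPTQ additionally adjusts the remaining weights
    to compensate for the rounding error, which is not done here.
    """
    maxq = 2 ** bits - 1        # largest integer code (15 for 4 bits)
    zero = (maxq + 1) // 2      # fixed zero point of the symmetric grid (8)
    absmax = max(abs(w) for w in weights)
    scale = (2 * absmax) / maxq if absmax > 0 else 1.0
    # Map each weight to an integer code and clamp it to [0, maxq].
    codes = [min(maxq, max(0, round(w / scale) + zero)) for w in weights]
    return codes, scale, zero


def dequantize_group(codes, scale, zero):
    """Map integer codes back to approximate floating-point weights."""
    return [scale * (q - zero) for q in codes]


def quantize_row(row, group_size=128, bits=4):
    """Quantize a row in consecutive groups, each with its own scale."""
    return [
        quantize_group(row[start:start + group_size], bits)
        for start in range(0, len(row), group_size)
    ]
```

Each group of 128 weights shares one scale, so a smaller `group_size` tracks local weight ranges more closely at the cost of storing more scales.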

Evaluations

Below is a comprehensive evaluation using the awesome mosaicml/llm-foundry.

model_name	core_average	world_knowledge	commonsense_reasoning	language_understanding	symbolic_problem_solving	reading_comprehension
upstage/SOLAR-10.7B-Instruct-v1.0	0.594131	0.602579	0.600195	0.747605	0.406245	0.614029
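
As a quick sanity check on the summary row, the reported `core_average` is consistent with the unweighted mean of the five category scores. The snippet below only re-derives that number from the values in the table; it makes no claim about how the category scores themselves are computed.

```python
from statistics import mean

# Category scores copied from the summary row above.
category_scores = {
    "world_knowledge": 0.602579,
    "commonsense_reasoning": 0.600195,
    "language_understanding": 0.747605,
    "symbolic_problem_solving": 0.406245,
    "reading_comprehension": 0.614029,
}

reported_core_average = 0.594131
core_average = mean(category_scores.values())

# The two values agree up to the rounding of the table entries.
print(f"recomputed={core_average:.6f} reported={reported_core_average:.6f}")
```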

Category	Benchmark	Subtask	Accuracy	Few-shot setting
symbolic_problem_solving	gsm8k		0.638362	0-shot
commonsense_reasoning	copa		0.84	0-shot
commonsense_reasoning	commonsense_qa		0.841933	0-shot
commonsense_reasoning	piqa		0.818281	0-shot
commonsense_reasoning	bigbench_strange_stories		0.793103	0-shot
commonsense_reasoning	bigbench_strategy_qa		0.66623	0-shot
language_understanding	lambada_openai		0.735882	0-shot
language_understanding	hellaswag		0.855208	0-shot
reading_comprehension	coqa		0.222723	0-shot
reading_comprehension	boolq		0.893884	0-shot
world_knowledge	triviaqa_sm_sub		0.628333	3-shot
world_knowledge	jeopardy	Average	0.500792	3-shot
world_knowledge		american_history	0.581114	3-shot
world_knowledge		literature	0.655102	3-shot
world_knowledge		science	0.371849	3-shot
world_knowledge		word_origins	0.271233	3-shot
world_knowledge		world_history	0.624665	3-shot
world_knowledge	bigbench_qa_wikidata		0.669209	3-shot
world_knowledge	arc_easy		0.815657	3-shot
world_knowledge	arc_challenge		0.650171	3-shot
commonsense_reasoning	siqa		0.881781	3-shot
language_understanding	winograd		0.897436	3-shot
symbolic_problem_solving	bigbench_operators		0.595238	3-shot
reading_comprehension	squad		0.626395	3-shot
symbolic_problem_solving	svamp		0.603333	5-shot
world_knowledge	mmlu	Average	0.647028	5-shot
world_knowledge		abstract_algebra	0.29	5-shot
world_knowledge		anatomy	0.577778	5-shot
world_knowledge		astronomy	0.710526	5-shot
world_knowledge		business_ethics	0.73	5-shot
world_knowledge		clinical_knowledge	0.701887	5-shot
world_knowledge		college_biology	0.729167	5-shot
world_knowledge		college_chemistry	0.39	5-shot
world_knowledge		college_computer_science	0.5	5-shot
world_knowledge		college_mathematics	0.31	5-shot
world_knowledge		college_medicine	0.66474	5-shot
world_knowledge		college_physics	0.411765	5-shot
world_knowledge		computer_security	0.72	5-shot
world_knowledge		conceptual_physics	0.582979	5-shot
world_knowledge		econometrics	0.473684	5-shot
world_knowledge		electrical_engineering	0.565517	5-shot
world_knowledge		elementary_mathematics	0.470899	5-shot
world_knowledge		formal_logic	0.460317	5-shot
world_knowledge		global_facts	0.33	5-shot
world_knowledge		high_school_biology	0.770968	5-shot
world_knowledge		high_school_chemistry	0.448276	5-shot
world_knowledge		high_school_computer_science	0.71	5-shot
world_knowledge		high_school_european_history	0.830303	5-shot
world_knowledge		high_school_geography	0.848485	5-shot
world_knowledge		high_school_government_and_politics	0.896373	5-shot
world_knowledge		high_school_macroeconomics	0.646154	5-shot
world_knowledge		high_school_mathematics	0.348148	5-shot
world_knowledge		high_school_microeconomics	0.722689	5-shot
world_knowledge		high_school_physics	0.344371	5-shot
world_knowledge		high_school_psychology	0.833028	5-shot
world_knowledge		high_school_statistics	0.523148	5-shot
world_knowledge		high_school_us_history	0.852941	5-shot
world_knowledge		high_school_world_history	0.827004	5-shot
world_knowledge		human_aging	0.713004	5-shot
world_knowledge		human_sexuality	0.755725	5-shot
world_knowledge		international_law	0.768595	5-shot
world_knowledge		jurisprudence	0.796296	5-shot
world_knowledge		logical_fallacies	0.723926	5-shot
world_knowledge		machine_learning	0.508929	5-shot
world_knowledge		management	0.825243	5-shot
world_knowledge		marketing	0.871795	5-shot
world_knowledge		medical_genetics	0.73	5-shot
world_knowledge		miscellaneous	0.814815	5-shot
world_knowledge		moral_disputes	0.736994	5-shot
world_knowledge		moral_scenarios	0.43352	5-shot
world_knowledge		nutrition	0.728758	5-shot
world_knowledge		philosophy	0.700965	5-shot
world_knowledge		prehistory	0.765432	5-shot
world_knowledge		professional_accounting	0.507092	5-shot
world_knowledge		professional_law	0.487614	5-shot
world_knowledge		professional_medicine	0.727941	5-shot
world_knowledge		professional_psychology	0.661765	5-shot
world_knowledge		public_relations	0.718182	5-shot
world_knowledge		security_studies	0.669388	5-shot
world_knowledge		sociology	0.81592	5-shot
world_knowledge		us_foreign_policy	0.89	5-shot
world_knowledge		virology	0.518072	5-shot
world_knowledge		world_religions	0.789474	5-shot
symbolic_problem_solving	bigbench_dyck_languages		0.458	5-shot
language_understanding	winogrande		0.826361	5-shot
symbolic_problem_solving	agi_eval_lsat_ar		0.269565	5-shot
symbolic_problem_solving	simple_arithmetic_nospaces		0.372	5-shot
symbolic_problem_solving	simple_arithmetic_withspaces		0.367	5-shot
reading_comprehension	agi_eval_lsat_rc		0.794776	5-shot
reading_comprehension	agi_eval_lsat_lr		0.641176	5-shot
reading_comprehension	agi_eval_sat_en		0.849515	5-shot
world_knowledge	arc_challenge		0.670648	25-shot
commonsense_reasoning	openbook_qa		0.56	10-shot
language_understanding	hellaswag		0.866461	10-shot
	bigbench_cs_algorithms		0.652273	10-shot
symbolic_problem_solving	bigbench_elementary_math_qa		0.392453	1-shot
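
Rows with an empty Benchmark column are subtasks of the benchmark named in the preceding "Average" row. The sketch below parses the jeopardy rows from the table above as tab-separated text and checks that the reported average matches the plain mean of its five subtasks; the variable names are illustrative only.

```python
import csv
import io
from statistics import mean

# The jeopardy rows from the table above, as tab-separated text.
TSV = "\n".join([
    "world_knowledge\tjeopardy\tAverage\t0.500792\t3-shot",
    "world_knowledge\t\tamerican_history\t0.581114\t3-shot",
    "world_knowledge\t\tliterature\t0.655102\t3-shot",
    "world_knowledge\t\tscience\t0.371849\t3-shot",
    "world_knowledge\t\tword_origins\t0.271233\t3-shot",
    "world_knowledge\t\tworld_history\t0.624665\t3-shot",
])

reported = None
subtask_scores = {}
for category, benchmark, subtask, accuracy, shots in csv.reader(
    io.StringIO(TSV), delimiter="\t"
):
    if subtask == "Average":
        reported = float(accuracy)      # benchmark-level row
    else:
        subtask_scores[subtask] = float(accuracy)

recomputed = mean(subtask_scores.values())
print(f"reported={reported:.6f} recomputed={recomputed:.6f}")
```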

Safetensors

Model size: 11B params
Tensor type: I32, F16