Instructions to use ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit")
model = AutoModelForCausalLM.from_pretrained("ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Apply the chat template and move the tensors to the model's device
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Use Docker

```shell
docker model run hf.co/ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit
```
- SGLang
How to use ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Use Docker images

```shell
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

- Docker Model Runner
How to use ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit with Docker Model Runner:
```shell
docker model run hf.co/ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit
```
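The curl calls shown above for vLLM and SGLang can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `ask` are hypothetical, and the default base URL assumes the vLLM server from the example (use port 30000 for the SGLang one).

```python
import json
import urllib.request

MODEL_ID = "ISTA-DASLab/SOLAR-10.7B-Instruct-v1.0-GPTQ-4bit"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions route.

    `base_url` is an assumption: it must point at a server that is already
    running, as started in the vLLM or SGLang examples.
    """
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, base_url="http://localhost:8000"):
    """Send the prompt and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    # OpenAI-style responses carry the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]


# Usage, once a server is running:
#   print(ask("What is the capital of France?"))
```

Splitting request construction from sending keeps the payload easy to inspect without a live server.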
Description
4-bit quantization of upstage/SOLAR-10.7B-Instruct-v1.0 using GPTQ. We use the config below for both quantization and evaluation, with HuggingFaceH4/ultrachat_200k as the calibration data. The code is available under this repository.
```yaml
bits: 4
damp_percent: 0.01
desc_act: true
exllama_config:
  version: 2
group_size: 128
quant_method: gptq
static_groups: false
sym: true
true_sequential: true
```
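To make `bits: 4`, `group_size: 128` and `sym: true` concrete, here is a simplified sketch of symmetric group-wise quantization: every group of 128 weights shares one scale, and each weight is stored as an integer code in [0, 15] around a fixed zero point of 8. This is plain round-to-nearest on hypothetical toy data, not the code from the repository. GPTQ itself additionally adjusts the not-yet-quantized weights to compensate for rounding error, and `desc_act` changes the order in which columns are processed; neither is modeled here.

```python
import math


def quantize_group(weights, bits=4):
    """Symmetric round-to-nearest quantization of one group of weights.

    Returns (codes, scale, zero): unsigned codes in [0, 2**bits - 1], the
    step size shared by the group, and the fixed mid-range zero point.
    """
    maxq = 2 ** bits - 1      # 15 for bits=4
    zero = (maxq + 1) // 2    # 8: a symmetric grid is centred on zero
    max_abs = max(abs(w) for w in weights)
    scale = (2 * max_abs) / maxq if max_abs > 0 else 1.0
    codes = [min(maxq, max(0, round(w / scale) + zero)) for w in weights]
    return codes, scale, zero


def dequantize_group(codes, scale, zero):
    """Map integer codes back to approximate floating-point weights."""
    return [scale * (c - zero) for c in codes]


def quantize_row(row, group_size=128, bits=4):
    """Quantize a row of weights in consecutive groups of `group_size`."""
    return [
        quantize_group(row[start:start + group_size], bits)
        for start in range(0, len(row), group_size)
    ]


# Hypothetical toy data: one row of 256 weights, i.e. two groups of 128.
row = [0.1 * math.sin(i) for i in range(256)]
groups = quantize_row(row)
restored = [w for g in groups for w in dequantize_group(*g)]
max_err = max(abs(a - b) for a, b in zip(row, restored))
print(len(groups), max_err)
```

In this sketch the reconstruction error of any weight is at most half the scale of its group, which is why smaller groups (more scales) generally track the original weights more closely at the cost of extra metadata.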
Evaluations
Below is a comprehensive evaluation using the awesome mosaicml/llm-foundry.
| model_name | core_average | world_knowledge | commonsense_reasoning | language_understanding | symbolic_problem_solving | reading_comprehension |
|---|---|---|---|---|---|---|
| upstage/SOLAR-10.7B-Instruct-v1.0 | 0.594131 | 0.602579 | 0.600195 | 0.747605 | 0.406245 | 0.614029 |
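In the row above, `core_average` agrees to six decimals with the unweighted mean of the five category columns. The short check below uses only the numbers from the table; how llm-foundry aggregates individual benchmarks into each category score is not reproduced here.

```python
from statistics import mean

# Category scores copied from the summary table above.
category_scores = {
    "world_knowledge": 0.602579,
    "commonsense_reasoning": 0.600195,
    "language_understanding": 0.747605,
    "symbolic_problem_solving": 0.406245,
    "reading_comprehension": 0.614029,
}

core_average = mean(category_scores.values())
print(round(core_average, 6))  # 0.594131, matching the reported core_average
```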
| Category | Benchmark | Subtask | Accuracy | Few-shot setting |
|---|---|---|---|---|
| symbolic_problem_solving | gsm8k | | 0.638362 | 0-shot |
| commonsense_reasoning | copa | | 0.84 | 0-shot |
| commonsense_reasoning | commonsense_qa | | 0.841933 | 0-shot |
| commonsense_reasoning | piqa | | 0.818281 | 0-shot |
| commonsense_reasoning | bigbench_strange_stories | | 0.793103 | 0-shot |
| commonsense_reasoning | bigbench_strategy_qa | | 0.66623 | 0-shot |
| language_understanding | lambada_openai | | 0.735882 | 0-shot |
| language_understanding | hellaswag | | 0.855208 | 0-shot |
| reading_comprehension | coqa | | 0.222723 | 0-shot |
| reading_comprehension | boolq | | 0.893884 | 0-shot |
| world_knowledge | triviaqa_sm_sub | | 0.628333 | 3-shot |
| world_knowledge | jeopardy | Average | 0.500792 | 3-shot |
| world_knowledge | | american_history | 0.581114 | 3-shot |
| world_knowledge | | literature | 0.655102 | 3-shot |
| world_knowledge | | science | 0.371849 | 3-shot |
| world_knowledge | | word_origins | 0.271233 | 3-shot |
| world_knowledge | | world_history | 0.624665 | 3-shot |
| world_knowledge | bigbench_qa_wikidata | | 0.669209 | 3-shot |
| world_knowledge | arc_easy | | 0.815657 | 3-shot |
| world_knowledge | arc_challenge | | 0.650171 | 3-shot |
| commonsense_reasoning | siqa | | 0.881781 | 3-shot |
| language_understanding | winograd | | 0.897436 | 3-shot |
| symbolic_problem_solving | bigbench_operators | | 0.595238 | 3-shot |
| reading_comprehension | squad | | 0.626395 | 3-shot |
| symbolic_problem_solving | svamp | | 0.603333 | 5-shot |
| world_knowledge | mmlu | Average | 0.647028 | 5-shot |
| world_knowledge | | abstract_algebra | 0.29 | 5-shot |
| world_knowledge | | anatomy | 0.577778 | 5-shot |
| world_knowledge | | astronomy | 0.710526 | 5-shot |
| world_knowledge | | business_ethics | 0.73 | 5-shot |
| world_knowledge | | clinical_knowledge | 0.701887 | 5-shot |
| world_knowledge | | college_biology | 0.729167 | 5-shot |
| world_knowledge | | college_chemistry | 0.39 | 5-shot |
| world_knowledge | | college_computer_science | 0.5 | 5-shot |
| world_knowledge | | college_mathematics | 0.31 | 5-shot |
| world_knowledge | | college_medicine | 0.66474 | 5-shot |
| world_knowledge | | college_physics | 0.411765 | 5-shot |
| world_knowledge | | computer_security | 0.72 | 5-shot |
| world_knowledge | | conceptual_physics | 0.582979 | 5-shot |
| world_knowledge | | econometrics | 0.473684 | 5-shot |
| world_knowledge | | electrical_engineering | 0.565517 | 5-shot |
| world_knowledge | | elementary_mathematics | 0.470899 | 5-shot |
| world_knowledge | | formal_logic | 0.460317 | 5-shot |
| world_knowledge | | global_facts | 0.33 | 5-shot |
| world_knowledge | | high_school_biology | 0.770968 | 5-shot |
| world_knowledge | | high_school_chemistry | 0.448276 | 5-shot |
| world_knowledge | | high_school_computer_science | 0.71 | 5-shot |
| world_knowledge | | high_school_european_history | 0.830303 | 5-shot |
| world_knowledge | | high_school_geography | 0.848485 | 5-shot |
| world_knowledge | | high_school_government_and_politics | 0.896373 | 5-shot |
| world_knowledge | | high_school_macroeconomics | 0.646154 | 5-shot |
| world_knowledge | | high_school_mathematics | 0.348148 | 5-shot |
| world_knowledge | | high_school_microeconomics | 0.722689 | 5-shot |
| world_knowledge | | high_school_physics | 0.344371 | 5-shot |
| world_knowledge | | high_school_psychology | 0.833028 | 5-shot |
| world_knowledge | | high_school_statistics | 0.523148 | 5-shot |
| world_knowledge | | high_school_us_history | 0.852941 | 5-shot |
| world_knowledge | | high_school_world_history | 0.827004 | 5-shot |
| world_knowledge | | human_aging | 0.713004 | 5-shot |
| world_knowledge | | human_sexuality | 0.755725 | 5-shot |
| world_knowledge | | international_law | 0.768595 | 5-shot |
| world_knowledge | | jurisprudence | 0.796296 | 5-shot |
| world_knowledge | | logical_fallacies | 0.723926 | 5-shot |
| world_knowledge | | machine_learning | 0.508929 | 5-shot |
| world_knowledge | | management | 0.825243 | 5-shot |
| world_knowledge | | marketing | 0.871795 | 5-shot |
| world_knowledge | | medical_genetics | 0.73 | 5-shot |
| world_knowledge | | miscellaneous | 0.814815 | 5-shot |
| world_knowledge | | moral_disputes | 0.736994 | 5-shot |
| world_knowledge | | moral_scenarios | 0.43352 | 5-shot |
| world_knowledge | | nutrition | 0.728758 | 5-shot |
| world_knowledge | | philosophy | 0.700965 | 5-shot |
| world_knowledge | | prehistory | 0.765432 | 5-shot |
| world_knowledge | | professional_accounting | 0.507092 | 5-shot |
| world_knowledge | | professional_law | 0.487614 | 5-shot |
| world_knowledge | | professional_medicine | 0.727941 | 5-shot |
| world_knowledge | | professional_psychology | 0.661765 | 5-shot |
| world_knowledge | | public_relations | 0.718182 | 5-shot |
| world_knowledge | | security_studies | 0.669388 | 5-shot |
| world_knowledge | | sociology | 0.81592 | 5-shot |
| world_knowledge | | us_foreign_policy | 0.89 | 5-shot |
| world_knowledge | | virology | 0.518072 | 5-shot |
| world_knowledge | | world_religions | 0.789474 | 5-shot |
| symbolic_problem_solving | bigbench_dyck_languages | | 0.458 | 5-shot |
| language_understanding | winogrande | | 0.826361 | 5-shot |
| symbolic_problem_solving | agi_eval_lsat_ar | | 0.269565 | 5-shot |
| symbolic_problem_solving | simple_arithmetic_nospaces | | 0.372 | 5-shot |
| symbolic_problem_solving | simple_arithmetic_withspaces | | 0.367 | 5-shot |
| reading_comprehension | agi_eval_lsat_rc | | 0.794776 | 5-shot |
| reading_comprehension | agi_eval_lsat_lr | | 0.641176 | 5-shot |
| reading_comprehension | agi_eval_sat_en | | 0.849515 | 5-shot |
| world_knowledge | arc_challenge | | 0.670648 | 25-shot |
| commonsense_reasoning | openbook_qa | | 0.56 | 10-shot |
| language_understanding | hellaswag | | 0.866461 | 10-shot |
| | bigbench_cs_algorithms | | 0.652273 | 10-shot |
| symbolic_problem_solving | bigbench_elementary_math_qa | | 0.392453 | 1-shot |