Instructions for using Yhyu13/LMCocktail-Mistral-7B-v1 with libraries and local apps.

Libraries

How to use Yhyu13/LMCocktail-Mistral-7B-v1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Yhyu13/LMCocktail-Mistral-7B-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the generated reply is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Yhyu13/LMCocktail-Mistral-7B-v1")
model = AutoModelForCausalLM.from_pretrained("Yhyu13/LMCocktail-Mistral-7B-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Yhyu13/LMCocktail-Mistral-7B-v1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Yhyu13/LMCocktail-Mistral-7B-v1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Yhyu13/LMCocktail-Mistral-7B-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
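
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server from the previous step is listening on localhost:8000, and the helper names `build_chat_request` and `ask` are hypothetical, chosen for illustration.

```python
import json
import urllib.request

# Assumed values: the port used by the curl example above and this model's name.
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "Yhyu13/LMCocktail-Mistral-7B-v1"


def build_chat_request(prompt: str, url: str = BASE_URL, model: str = MODEL) -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible chat responses carry the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]


# Requires the server to be running:
# print(ask("What is the capital of France?"))
```

Pointing `BASE_URL` at port 30000 instead should work the same way against the SGLang server, since both expose an OpenAI-compatible endpoint.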

Use Docker

docker model run hf.co/Yhyu13/LMCocktail-Mistral-7B-v1

SGLang

How to use Yhyu13/LMCocktail-Mistral-7B-v1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Yhyu13/LMCocktail-Mistral-7B-v1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Yhyu13/LMCocktail-Mistral-7B-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Yhyu13/LMCocktail-Mistral-7B-v1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Yhyu13/LMCocktail-Mistral-7B-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Yhyu13/LMCocktail-Mistral-7B-v1 with Docker Model Runner:
```
docker model run hf.co/Yhyu13/LMCocktail-Mistral-7B-v1
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

LM-cocktail Mistral 7B v1

This is a 50%-50% merge of two of the best Mistral models:

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

https://huggingface.co/xDAN-AI/xDAN-L1-Chat-RL-v1

Both are claimed to outperform chatgpt-3.5-turbo on almost all metrics.
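
A 50%-50% merge of this kind can be read as an element-wise weighted average of the two models' parameters. The following is a minimal sketch of that idea in plain Python, not the actual merging script from this repo: `merge_state_dicts`, the toy parameter names, and the list-of-floats representation are all assumptions made for illustration. A real merge would operate on the models' weight tensors.

```python
from typing import Dict, List

# Toy stand-in for a model's parameters: name -> flat list of values (assumption).
StateDict = Dict[str, List[float]]


def merge_state_dicts(models: List[StateDict], weights: List[float]) -> StateDict:
    """Return the element-wise weighted average of several parameter sets."""
    if len(models) != len(weights) or not models:
        raise ValueError("need exactly one weight per model")
    if abs(sum(weights) - 1.0) > 1e-8:
        raise ValueError("weights must sum to 1")

    names = models[0].keys()
    merged: StateDict = {}
    for name in names:
        length = len(models[0][name])
        # Every model must expose the same parameters with the same sizes.
        for model in models:
            if name not in model or len(model[name]) != length or model.keys() != names:
                raise ValueError(f"parameter mismatch for {name!r}")
        merged[name] = [
            sum(w * model[name][i] for w, model in zip(weights, models))
            for i in range(length)
        ]
    return merged


# Two toy "models" merged with equal weights, as in a 50%-50% cocktail.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 6.0], "layer.bias": [1.0]}
merged = merge_state_dicts([model_a, model_b], [0.5, 0.5])
print(merged)  # {'layer.weight': [2.0, 4.0], 'layer.bias': [0.5]}
```

Averaging like this only makes sense when the models share an architecture, which is why the sketch rejects mismatched parameter names or sizes.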

Alpaca Eval

I am thrilled to announce that, with ChatGPT as the judge, LMCocktail 7B ranked second only to GPT-4 on AlpacaEval in my local community run, ahead of even my previous best model, LMCocktail-10.7B-v1. The leaderboard is also available at ./Alpaca_eval/chatgpt_fn_--LMCocktail-Mistral-7B-v1/

                        win_rate  standard_error  n_total  avg_length
gpt4                       73.79            1.54      805        1365
LMCocktail-7B-v1(new)      73.54            1.55      805        1870
LMCocktail-10.7B-v1(new)   73.45            1.56      804        1203
claude                     70.37            1.60      805        1082
chatgpt                    66.09            1.66      805         811
wizardlm-13b               65.16            1.67      805         985
vicuna-13b                 64.10            1.69      805        1037
guanaco-65b                62.36            1.71      805        1249
oasst-rlhf-llama-33b       62.05            1.71      805        1079
alpaca-farm-ppo-human      60.25            1.72      805         803
falcon-40b-instruct        56.52            1.74      805         662
text_davinci_003           50.00            0.00      805         307
alpaca-7b                  45.22            1.74      805         396
text_davinci_001           28.07            1.56      805         296
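
When reading the table, it helps to compare the gap between two win rates with their standard errors. The sketch below expresses a gap in units of the combined standard error, using values copied from the table. It assumes the two estimates are independent, which is only an approximation here because every model is scored on the same prompts; the helper name `gap_in_standard_errors` is hypothetical.

```python
import math


def gap_in_standard_errors(win_a: float, se_a: float, win_b: float, se_b: float) -> float:
    """Win-rate gap divided by the combined standard error (independence assumed)."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (win_a - win_b) / combined_se


# (win_rate, standard_error) pairs taken from the leaderboard above.
gpt4 = (73.79, 1.54)
lmcocktail_7b = (73.54, 1.55)
chatgpt = (66.09, 1.66)

gap_to_gpt4 = gap_in_standard_errors(*gpt4, *lmcocktail_7b)
gap_over_chatgpt = gap_in_standard_errors(*lmcocktail_7b, *chatgpt)

print(f"gpt4 vs LMCocktail-7B-v1:    {gap_to_gpt4:.2f} combined standard errors")
print(f"LMCocktail-7B-v1 vs chatgpt: {gap_over_chatgpt:.2f} combined standard errors")
```

Under this rough reading, the top three rows sit well within one standard error of each other, while the gap to chatgpt is several times larger.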

Code

LM-Cocktail is a novel technique for merging multiple models: https://arxiv.org/abs/2311.13534

The code is backed by this repo: https://github.com/FlagOpen/FlagEmbedding.git

Merging scripts are available under the ./scripts folder.

Paper for Yhyu13/LMCocktail-Mistral-7B-v1

LM-Cocktail: Resilient Tuning of Language Models via Model Merging

Paper • 2311.13534 • Published Nov 22, 2023