Qwen1.5-22B-Chat-Merge
--This is a 22B frankenmerge of Qwen1.5-14B-Chat, created by interleaving layers of Qwen1.5-14B-Chat with itself using mergekit.--
Because the Qwen1.5 series currently has no models between 14B and 72B, I am trying to build some mid-sized ones, in the 20B+ and 30B+ range, by merging. The aim is to let more individual users make full use of their hardware.
-Quantize
GGUF here: gguf
-Merge Configuration
The merge was produced with the following mergekit YAML:
dtype: float16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 5]
    model: Qwen/Qwen1.5-14B-Chat
- sources:
  - layer_range: [5, 15]
    model: Qwen/Qwen1.5-14B-Chat
- sources:
  - layer_range: [10, 20]
    model: Qwen/Qwen1.5-14B-Chat
- sources:
  - layer_range: [15, 25]
    model: Qwen/Qwen1.5-14B-Chat
- sources:
  - layer_range: [20, 30]
    model: Qwen/Qwen1.5-14B-Chat
- sources:
  - layer_range: [25, 35]
    model: Qwen/Qwen1.5-14B-Chat
- sources:
  - layer_range: [30, 40]
    model: Qwen/Qwen1.5-14B-Chat
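
To see what the slices above add up to, here is a minimal Python sketch that counts the layers in the merged stack. It assumes each `layer_range` is half-open (`[start, end)`) and that the passthrough method simply stacks the slices in the order listed; the helper name `merged_layer_order` is hypothetical and not part of mergekit.

```python
from collections import Counter

# Slice boundaries copied from the merge configuration above.
# Assumption: each layer_range is half-open, i.e. [start, end).
SLICES = [(0, 5), (5, 15), (10, 20), (15, 25), (20, 30), (25, 35), (30, 40)]


def merged_layer_order(slices):
    """Return the source-layer index used at each position of the merged stack."""
    order = []
    for start, end in slices:
        order.extend(range(start, end))
    return order


order = merged_layer_order(SLICES)
total_layers = len(order)

# Source layers that appear more than once in the merged stack.
reuse = Counter(order)
duplicated = sorted(layer for layer, count in reuse.items() if count > 1)

print(total_layers)     # 65
print(len(set(order)))  # 40 distinct source layers
print(len(duplicated))  # 25 layers (10 through 34) are used twice
```

Under these assumptions the merged model stacks 65 layers built from the 40 source layers, with layers 10 through 34 each appearing twice, which accounts for the growth from roughly 14B to roughly 22B parameters.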
-Performance
- Tips: I am not able to run benchmark tests, and I have not been able to use the model extensively, so my test results may not be accurate.
In most of my own (subjective) tests it performs better than the 14B version in comprehension, reasoning, coherence, and writing. If you believe in this model's potential, feel free to test it or share evaluations. Everyone's tests and evaluations are welcome.