How to use SvdH/Vengeance-22B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="SvdH/Vengeance-22B")

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```
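For more control over decoding, the pipeline accepts the standard generation arguments. The sketch below is illustrative only: the sampling values are assumptions, not settings recommended for this model, and the output indexing assumes the chat-style return format of recent Transformers releases.

```python
# Sketch: generation settings are illustrative assumptions, not tuned for this model.
from transformers import pipeline

pipe = pipeline("text-generation", model="SvdH/Vengeance-22B")

messages = [
    {"role": "user", "content": "Who are you?"},
]

# max_new_tokens caps the reply length; do_sample/temperature control randomness.
result = pipe(messages, max_new_tokens=128, do_sample=True, temperature=0.7)

# For chat-style input, "generated_text" holds the conversation; the last message is the reply.
print(result[0]["generated_text"][-1]["content"])
```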
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("SvdH/Vengeance-22B")
model = AutoModelForCausalLM.from_pretrained("SvdH/Vengeance-22B")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
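A 22B-parameter checkpoint in full precision will not fit on most single GPUs. One common approach, shown as a minimal sketch below, is to load the weights in reduced precision and let `device_map="auto"` place layers across the available devices; this assumes the `accelerate` package is installed, and the bfloat16 choice is an assumption rather than a requirement of this model.

```python
# Sketch: assumes `accelerate` is installed; bfloat16 is an assumption, not a model requirement.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("SvdH/Vengeance-22B")
model = AutoModelForCausalLM.from_pretrained(
    "SvdH/Vengeance-22B",
    torch_dtype=torch.bfloat16,   # roughly half the memory of float32
    device_map="auto",            # spread layers over available GPU(s)/CPU
)
```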
How to use SvdH/Vengeance-22B with vLLM:
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "SvdH/Vengeance-22B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "SvdH/Vengeance-22B",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```
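Because the vLLM server exposes an OpenAI-compatible API, it can also be called from the official `openai` Python client. A minimal sketch follows; the `api_key` value is a placeholder, since a local server does not verify it by default.

```python
# Sketch: assumes `pip install openai` and a vLLM server running on localhost:8000.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # key is a placeholder

response = client.chat.completions.create(
    model="SvdH/Vengeance-22B",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```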
How to use SvdH/Vengeance-22B with SGLang:
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "SvdH/Vengeance-22B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "SvdH/Vengeance-22B",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```
```shell
# Alternatively, run the SGLang server with Docker:
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "SvdH/Vengeance-22B" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "SvdH/Vengeance-22B",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```
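Besides the OpenAI-compatible route shown above, an SGLang server also exposes a native `/generate` endpoint. The rough sketch below uses the `requests` library; the sampling values are illustrative, and note that this endpoint takes raw text, so the model's chat template is not applied automatically.

```python
# Sketch: assumes `pip install requests` and an SGLang server on localhost:30000.
import requests

resp = requests.post(
    "http://localhost:30000/generate",
    json={
        "text": "What is the capital of France?",
        "sampling_params": {"max_new_tokens": 32, "temperature": 0},
    },
)
print(resp.json()["text"])
```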
How to use SvdH/Vengeance-22B with Docker Model Runner:

```shell
docker model run hf.co/SvdH/Vengeance-22B
```
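Docker Model Runner also serves an OpenAI-compatible API. The call below is only a sketch: it assumes host-side TCP access is enabled, and the port (12434) and `/engines/v1` path are assumptions that may differ depending on your Docker configuration.

```shell
# Sketch: assumes Docker Model Runner host TCP access is enabled; port and path may differ on your setup.
curl -X POST "http://localhost:12434/engines/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "hf.co/SvdH/Vengeance-22B",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```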
Hi!
Would it be possible to share some information about this model and what you were trying to achieve with it? What is the aim of this merge? What is the outcome? What are some of its limitations?