Instructions to use Gensyn/Qwen2.5-1.5B-Instruct with libraries and local apps.

Libraries

How to use Gensyn/Qwen2.5-1.5B-Instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Gensyn/Qwen2.5-1.5B-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Gensyn/Qwen2.5-1.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Gensyn/Qwen2.5-1.5B-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
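
To make the role of `add_generation_prompt=True` more concrete, the following is a simplified sketch of the ChatML-style layout that Qwen chat templates are built on. The helper `render_chatml` is hypothetical and exists only for illustration. It leaves out any default system prompt and tool-calling branches that the real template may include, so `tokenizer.apply_chat_template` remains the authoritative source for the exact prompt.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render chat messages in a simplified ChatML layout (illustrative only)."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|> markers.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # An open assistant turn tells the model that it should answer next.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [{"role": "user", "content": "Who are you?"}]
prompt = render_chatml(messages)
print(prompt)
```

Without the generation prompt, the rendered text ends after the user turn, and the model has no marker indicating that an assistant reply should begin.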

Local Apps

vLLM

How to use Gensyn/Qwen2.5-1.5B-Instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Gensyn/Qwen2.5-1.5B-Instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Gensyn/Qwen2.5-1.5B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
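
The same request can be sent from Python instead of curl. The sketch below uses only the standard library and assumes the server started above is listening on `localhost:8000`. The helper names `build_chat_request` and `send` are hypothetical and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_content):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_chat_request(
    "http://localhost:8000",  # assumption: the default port used by the curl call above
    "Gensyn/Qwen2.5-1.5B-Instruct",
    "What is the capital of France?",
)
# With the server running, call: reply = send(request)
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.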

SGLang

How to use Gensyn/Qwen2.5-1.5B-Instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Gensyn/Qwen2.5-1.5B-Instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Gensyn/Qwen2.5-1.5B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
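
An OpenAI-compatible chat completions endpoint returns JSON in which the reply text sits under `choices[0].message.content`. The sketch below shows one way to extract it. The helper `extract_reply` is hypothetical, and the sample body is hand-written for illustration, not captured from a real server.

```python
import json


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style chat completions body."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written placeholder body (an assumption about shape only, not real output).
sample_body = json.dumps(
    {
        "model": "Gensyn/Qwen2.5-1.5B-Instruct",
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "(reply text would appear here)",
                },
            }
        ],
    }
)
print(extract_reply(sample_body))
```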

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Gensyn/Qwen2.5-1.5B-Instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Gensyn/Qwen2.5-1.5B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Gensyn/Qwen2.5-1.5B-Instruct with Docker Model Runner:
```
docker model run hf.co/Gensyn/Qwen2.5-1.5B-Instruct
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Qwen2.5-1.5B-Instruct

Introduction

This model is intended for use in the Gensyn RL Swarm, where it is fine-tuned locally using peer-to-peer reinforcement learning post-training.

Once fine-tuned, the model can be used as normal in any workflow. For details on how to do this, please refer to the original model documentation.

For more details on the original model, please refer to the original repository here.

This repo contains an unmodified version of the instruction-tuned 1.5B Qwen2.5 model, which has the following features:

Type: Causal Language Models
Training Stage: Pretraining & Post-training
Architecture: transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias and tied word embeddings
Number of Parameters: 1.54B
Number of Parameters (Non-Embedding): 1.31B
Number of Layers: 28
Number of Attention Heads (GQA): 12 for Q and 2 for KV
Context Length: Full 32,768 tokens and generation 8,192 tokens
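
As a rough illustration of what the GQA configuration above means for memory, the sketch below estimates the key/value cache size at the full context length and compares it with a hypothetical layout in which every query head had its own KV head. The head dimension of 128 and the 2-byte (bf16/fp16) value size are assumptions that do not appear in the list above, so treat the numbers as an estimate only.

```python
NUM_LAYERS = 28          # from the feature list
NUM_Q_HEADS = 12         # from the feature list
NUM_KV_HEADS = 2         # from the feature list
CONTEXT_LENGTH = 32_768  # from the feature list
HEAD_DIM = 128           # assumption: not stated in the feature list
BYTES_PER_VALUE = 2      # assumption: bf16 or fp16 cache


def kv_cache_bytes(num_kv_heads, num_tokens):
    """Bytes needed to cache keys and values for one sequence."""
    # The factor 2 accounts for storing both a key and a value per head.
    return 2 * NUM_LAYERS * num_kv_heads * HEAD_DIM * BYTES_PER_VALUE * num_tokens


gqa_bytes = kv_cache_bytes(NUM_KV_HEADS, CONTEXT_LENGTH)
mha_bytes = kv_cache_bytes(NUM_Q_HEADS, CONTEXT_LENGTH)

print(f"GQA cache (2 KV heads):       {gqa_bytes / 2**30:.3f} GiB")
print(f"Hypothetical 12 KV heads:     {mha_bytes / 2**30:.3f} GiB")
print(f"Reduction factor:             {mha_bytes // gqa_bytes}x")
```

Under these assumptions, sharing 2 KV heads across 12 query heads shrinks the cache by a factor of 6 relative to one KV head per query head.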

Requirements

This model is intended for use in the Gensyn RL Swarm system. For details on model requirements when using it outside of a swarm, refer to the original Qwen repo here.

Quickstart

To deploy this model into a swarm and/or participate in the Gensyn Testnet, follow the instructions in the RL Swarm repository, read about the testnet, read the RL Swarm overview, and/or read the RL Swarm technical report.