Instructions for using ReasoningTransferability/UniReason-Qwen3-14B-RL with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use ReasoningTransferability/UniReason-Qwen3-14B-RL with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ReasoningTransferability/UniReason-Qwen3-14B-RL")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ReasoningTransferability/UniReason-Qwen3-14B-RL")
model = AutoModelForCausalLM.from_pretrained("ReasoningTransferability/UniReason-Qwen3-14B-RL")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use ReasoningTransferability/UniReason-Qwen3-14B-RL with vLLM:
Install from pip and serve the model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "ReasoningTransferability/UniReason-Qwen3-14B-RL"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ReasoningTransferability/UniReason-Qwen3-14B-RL",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
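Besides curl, the server can be queried with any OpenAI-compatible client. A minimal sketch, assuming the `openai` Python package is installed and the `vllm serve` command above is running on its default port 8000:

```python
# Query the vLLM OpenAI-compatible server from Python.
# Assumption: `pip install openai`; server running at localhost:8000 as shown above.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # vLLM ignores the key

response = client.chat.completions.create(
    model="ReasoningTransferability/UniReason-Qwen3-14B-RL",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```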
- SGLang
How to use ReasoningTransferability/UniReason-Qwen3-14B-RL with SGLang:
Install from pip and serve the model
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "ReasoningTransferability/UniReason-Qwen3-14B-RL" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ReasoningTransferability/UniReason-Qwen3-14B-RL",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
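SGLang can also run the model without a server process via its offline engine. A minimal sketch, assuming a recent SGLang release that exposes the offline `Engine` API:

```python
# Offline (serverless) inference with SGLang's Engine API.
# Assumption: a recent sglang release exposing sglang.Engine.
import sglang as sgl

llm = sgl.Engine(model_path="ReasoningTransferability/UniReason-Qwen3-14B-RL")

prompts = ["What is the derivative of x^3 + 2x^2 - 5x + 1?"]
outputs = llm.generate(prompts, {"temperature": 0.7, "max_new_tokens": 256})
for out in outputs:
    print(out["text"])

llm.shutdown()  # release GPU resources
```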
Use Docker images

```sh
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "ReasoningTransferability/UniReason-Qwen3-14B-RL" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ReasoningTransferability/UniReason-Qwen3-14B-RL",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

- Docker Model Runner
How to use ReasoningTransferability/UniReason-Qwen3-14B-RL with Docker Model Runner:
```sh
docker model run hf.co/ReasoningTransferability/UniReason-Qwen3-14B-RL
```
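Docker Model Runner also exposes an OpenAI-compatible API. A hedged sketch, assuming host-side TCP access is enabled (Docker's documented default port is 12434; the exact endpoint path and model identifier below are assumptions, so check `docker model` documentation for your setup):

```python
# Call Docker Model Runner's OpenAI-compatible endpoint from Python.
# Assumptions: TCP host access enabled (e.g. via Docker Desktop settings),
# API reachable at localhost:12434, and the `openai` package installed.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="hf.co/ReasoningTransferability/UniReason-Qwen3-14B-RL",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```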
UniReason-Qwen3-14B-RL
This model is associated with the research paper: "Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning"
📄 Paper: https://arxiv.org/abs/2507.00432 💻 Code: https://github.com/ReasoningTransfer/Transferability-of-LLM-Reasoning
Abstract
Math reasoning has become the poster child of progress in large language models (LLMs), with new models rapidly surpassing human-level performance on benchmarks like MATH and AIME. But as math leaderboards improve week by week, it is worth asking: do these gains reflect broader problem-solving ability or just narrow overfitting?
Model Description
This model is an RL-GRPO-tuned version of Qwen3-14B focused on math reasoning. It was developed as part of research investigating the transferability of mathematical reasoning skills to general language tasks.
Key Research Questions Addressed:
- Does math reasoning training improve general LLM capabilities?
- How do different training methods (RL vs SFT) affect transferability?
- What is the trade-off between specialized math performance and general capabilities?
Model Details
- Base Model: Qwen3-14B
- Training Method: RL-GRPO
- Primary Focus: Math reasoning
- Training Data: Math-specific datasets
- Architecture: Transformer-based language model
- Parameters: 14B
Training Details
Training Method: RL-GRPO
Custom training methodology; see the paper for details.
Datasets Used
- Mathematical reasoning datasets
- See paper for complete dataset list
Performance
Math Reasoning Benchmarks
- MATH: See paper
- AIME: See paper
General Capabilities
- General QA: See paper
- Code Generation: See paper
- Instruction Following: See paper
For detailed performance metrics, please refer to the paper.
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "ReasoningTransferability/UniReason-Qwen3-14B-RL"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Example: Math reasoning
math_prompt = "Solve this step by step: What is the derivative of x^3 + 2x^2 - 5x + 1?"
inputs = tokenizer(math_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

# Example: General reasoning
general_prompt = "Explain the concept of supply and demand in economics."
inputs = tokenizer(general_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
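Since the base model is a Qwen3 chat model, applying the tokenizer's chat template generally produces better-formatted answers than raw prompts. A minimal sketch continuing from the snippet above, assuming the checkpoint ships Qwen3's default chat template (the `enable_thinking` flag and the sampling settings follow Qwen3's documented thinking-mode defaults and may not apply if the template was customized):

```python
# Chat-style generation via the tokenizer's chat template.
messages = [{"role": "user", "content": "What is the derivative of x^3 + 2x^2 - 5x + 1?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    # Assumption: Qwen3's default template accepts this flag; drop it if the
    # checkpoint ships a customized template without thinking-mode support.
    enable_thinking=True,
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.6, top_p=0.95)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```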
Limitations and Biases
- Specialization Trade-offs: As explored in the paper, models optimized for math reasoning may show reduced performance on general tasks
- Training Method Dependencies: Performance characteristics vary significantly between RL and SFT training approaches
- Domain Transfer: The extent of capability transfer from math to other domains is limited
- Computational Requirements: Inference requires significant GPU resources (see the sizing sketch below)
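A rough back-of-the-envelope sizing sketch, assuming fp16/bf16 weights at two bytes per parameter (KV cache and activations add overhead on top of this):

```python
# Weights-only memory estimate for a 14B-parameter model in fp16/bf16.
params = 14e9
bytes_per_param = 2  # fp16/bf16
weights_gb = params * bytes_per_param / 1e9
print(f"~{weights_gb:.0f} GB for weights alone")  # ~28 GB, before KV cache/activations
```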
Research Findings
Key findings from the associated paper:
- RL vs SFT: RL-tuned models show better transfer to general domains compared to SFT-tuned models
- Capability Trade-offs: Most math-specialized models fail to transfer gains to other domains
- Forgetting: SFT-tuned models often forget general capabilities during math-focused training
Ethical Considerations
- This model is intended for research purposes
- Users should be aware of potential biases in mathematical and general reasoning
- The model should not be used for making critical decisions without human oversight
- Consider the environmental impact of large model inference
Citation
If you use this model in your research, please cite both the model and the associated paper:
```bibtex
@misc{huan2025doesmathreasoningimprove,
  title={Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning},
  author={Maggie Huan and Yuetai Li and Tuney Zheng and Xiaoyu Xu and Seungone Kim and Minxin Du and Radha Poovendran and Graham Neubig and Xiang Yue},
  year={2025},
  eprint={2507.00432},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2507.00432},
}
```
Contact
For questions about this model or the associated research, please:
- Open an issue in this repository
- Contact the paper authors
- Reference the original paper: https://arxiv.org/abs/2507.00432
Acknowledgments
This work builds upon the research presented in "Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning" and uses Qwen3-14B as its foundation.
Model uploaded on 2025-07-03